open source

Simplifying Kubernetes With Docker Compose and Friends

Today we’re happy to announce we’re open sourcing our support for using Docker Compose on Kubernetes. We’ve had this capability in Docker Enterprise for a little while but as of today, you will be able to use this on any Kubernetes cluster you choose.

Why Do I Need Compose If I Already Have Kubernetes?

The Kubernetes API is really quite large. There are more than 50 first-class objects in the latest release, from Pods and Deployments to ValidatingWebhookConfiguration and ResourceQuota. This can lead to a verbosity in configuration, which then needs to be managed by you, the developer. Let’s look at a concrete example of that.

Original Link

Evolvement of Kubernetes to Manage Diverse IT Workloads

Kubernetes started in 2014. For the next two years, the adoption of Kubernetes as a container orchestration engine was slow but steady, as compared to its counterparts – Amazon ECS, Apache Mesos, Docker Swarm, GCE, etc. After 2016, Kubernetes started creeping into many IT systems that have a wide variety of container workloads and demand higher performance for scheduling, scaling and automation. This is to enable a cloud-native approach having a microservices architecture in application deployments. Leading tech giants (AWS, Alibaba, Microsoft Azure, Red Hat) have started new solutions based on Kubernetes and in 2018, they are consolidating to build a de facto Kubernetes solution which can cover every use case that handles dynamic hyperscale workloads.

Two very recent acquisitions show how much impact Kubernetes has had on the IT ecosystem: IBM’s acquisition of Red Hat and VMware’s acquisition of Heptio. IBM did not show direct interest in container orchestration itself but had its eye on Red Hat’s Kubernetes-based OpenShift.

Original Link

Kerberos Authenticator for Apache Cassandra

Coming right on the heels of announcing an open source LDAP authenticator for Apache Cassandra, we’re proud to now release an open source Kerberos authenticator. The new project makes Kerberos’ celebrated single sign-on and secure authentication capabilities available to all Apache Cassandra users.

For developers with an existing Kerberos environment, it’ll be pretty darn simple to configure a cluster to use the Kerberos authenticator. But you’ll first need to have a DNS server and a working Kerberos Key Distribution Center (KDC), and to issue each Cassandra node a Kerberos service principal, forward DNS record, and reverse DNS record.

Original Link

Running Imply Druid Distribution Inside Docker Container

Druid is an open-source data store designed for sub-second queries on real-time and historical data. Druid can scale to store trillions of events and ingest millions of events per second. Druid is best used to power user-facing data applications.

Imply is an analytics solution powered by Druid. The Imply Analytics platform includes Druid bundled with all its dependencies, an exploratory analytics UI, and a SQL layer. It also provides additional tools and scripts for easy management of Druid nodes.

Original Link

What the Heck Is Time-Series Data (And Why Do I Need a Time-Series Database)?

This is a primer on time-series data, and why you may not want to use a “normal” database to store it.

Here’s a riddle: what do self-driving Teslas, autonomous Wall Street trading algorithms, smart homes, transportation networks that fulfill lightning-fast same-day deliveries, and an open-data-publishing NYPD have in common?

Original Link

Amazon Corretto: Another No-Cost JDK

In the blog post "A Tale of Two Oracle JDKs," I compared and contrasted the two JDKs provided by Oracle: Oracle OpenJDK and Oracle JDK (Java SE). There are numerous other JDK offerings available, most of which are based on OpenJDK. One of these is Amazon Corretto, which is the subject of this post.

Today’s "What’s New with AWS" post "Introducing Amazon Corretto (Preview)" announces Amazon Corretto as "a no-cost, multiplatform, production-ready distribution of the Open Java Development Kit (OpenJDK)." Amazon Corretto’s main page describes what it has to offer: "Corretto comes with long-term support that will include performance enhancements and security fixes." This is significant because it advertises that Amazon will provide performance enhancements and security fixes to its JDK offerings past the six months in which Oracle has committed to making performance enhancements and security fixes to each new OpenJDK version.

Original Link

Why and How to Use Git LFS

Although Git is well known as a version control system, the use of Git LFS (Large File Storage) is often unknown to Git users. In this post I will try to explain why and when Git LFS should be used and how to use it. The source code of this post can be found on GitHub.

What Is It?

Git LFS is an open-source project and an extension to Git. The goal is to work more efficiently with large files and binary files in your repository.

Original Link

Buy or Build

One question that I hear a lot of people asking is whether they should buy or build the software that they need to run their enterprise. This is often a difficult question to answer. One thing I can say, having lived through many major software purchases, is that the main cost was understanding the system that was purchased, and that cost of understanding turned out to be far greater than expected. Large, expensive software products are often more costly to purchase than the price tag would indicate.

If you can gain a competitive edge through embodying it in software then it’s almost always better to build rather than to buy. This applies to entire products. Technology companies are sometimes purchased because of their customer base or inroads into a particular market segment or any number of other reasons. Of course, I’m not talking about off-the-shelf software. I’m talking about integrating essential components that were not developed in-house.

Original Link

Introduction to Selenium Automation Testing

In an era of extremely interactive and responsive software processes where several enterprises are using some form of Agile methodology, automation testing has become crucial for many software projects. Automation testing usually beats manual testing because it requires less time and fewer human resources, has a lower risk of errors, allows regular execution, and supports lights-out execution, regression testing, and functional testing. There are many commercial and open source tools available for supporting the growth of automation testing. Specifically, Selenium is one of the most widely used tools to build test automation for web applications.

1. What Is Selenium Testing?

Selenium introduction

Original Link

How to Build Hybrid Cloud Confidence

Software complexity has grown dramatically over the past decade, and enterprises are looking to hybrid cloud technologies to help power their applications and critical DevOps pipelines. But with so many moving pieces, how can you gain confidence in your hybrid cloud investment?

The hybrid cloud is not a new concept. Way back in 2010, AppDynamics founder Jyoti Bansal had an interesting take on hybrid cloud. The issues Jyoti discussed more than eight years ago are just as challenging today, particularly with architectures becoming more distributed and complex. Today’s enterprises must run myriad open source and commercial products. And new projects — some game-changers — keep sprouting up for companies to adopt. Vertical technologies like container orchestrators are going through rapid evolution as well. As they garner momentum, new software platforms are emerging to take advantage of these capabilities, requiring enterprises to double down on container management strategies.

Original Link

IBM Acquires Red Hat. Now Who Does Google Buy?

IBM announced yesterday that they had entered into an agreement to acquire Red Hat for $34 billion in cash. That’s one of those big milestones in IT history that will have a profound impact for years to come. But is that totally surprising? Not quite…

At the beginning of the year, as part of my 2018 predictions, here is what I had posted on Twitter:

Original Link

IBM to Buy Red Hat for $34 Billion

Red Hat, everyone’s favorite open source software giant, will soon be under new ownership.

IBM announced on Sunday that it will pay $34 billion to acquire Red Hat and its massive portfolio of OSS. The transaction still needs approval from regulators and shareholders (the latter of whom likely won’t mind, as Red Hat’s stock prices soared 50 percent after the news broke), but the deal is on pace to close in the second half of 2019.

Original Link

Adaptive Data Integration and Operations on Oracle Cloud Using StreamSets

StreamSets is pleased to announce a new partnership with Oracle Cloud Infrastructure (OCI). As enterprises move their big data workloads to the cloud, it becomes imperative that their Data Operations are more resilient and adaptive to continue to serve the business’s needs. This is why StreamSets Data Collector™ is now easily deployable on OCI.

What led us to this point? There are fundamental questions such as ‘What good is an Enterprise Data Hub (EDH) without the most current data?’ ‘What good is the EDH without lots of data sources feeding it?’ which leads to the follow up questions of ‘How do you manage data engineering as quickly as software development in a fast-paced DevOps world?’ ‘How do you manage change-data-capture (CDC) from Oracle, streaming log files, and batch SFTP dumps without using large and confusing toolsets?’

Original Link

Bromium DSL: A DSL to Test UI Actions [Video]

I recently heard about Bromium, a Domain Specific Language (DSL) to describe user actions on a UI. Given that I love everything regarding DSLs, I contacted its author, Hristo Vrigazov, and chatted with him about:

  • The DSL he has built
  • Building editors using Xtext
  • Working on an open-source project
  • His vision for language engineering


Original Link

A Cambrian Explosion of DevOps Tools

Any discussion of how to scale the benefits of DevOps invariably lands on tools. The planning, tracking, automation, and management tools we use define the "ground truth" of where and how work happens. One of the most interesting, and at times challenging, aspects of agile and DevOps transformations is the sheer volume of tools involved. How many are required? Must there be so many? Before we proceed further on our journey of defining value stream architecture, let’s look at how this ground truth has evolved to get us where we are today.

The Catalyst for DevOps Tool Diversification

We’re at an interesting time in the evolution of DevOps tools; the sheer number of available tools points to a sort of Cambrian explosion of tool specialization and diversity. Is all this diversity necessary? Will a big wave of consolidation drive the extinction of most of these tools? What are the lines of specialization driving the diversity, and do we need to consider them when architecting our software value streams? We need to address these questions and inspect the ground truth captured in today’s toolchains in order to inform the discussion of how to abstract away the tools’ implementation details to focus on the architecture of our value streams.

Original Link

Validation in Java Applications

Often, I have seen projects that didn’t appear to have any conscious strategy for data validation. Their teams worked under the great pressure of deadlines and unclear requirements, and just didn’t have enough time to implement validation in a proper and consistent way. So, data validation code could be found everywhere: in JavaScript snippets, Java screen controllers, business logic beans, domain model entities, database constraints, and triggers. This code was full of if-else statements that threw different unchecked exceptions, which made it hard to find the place where a given piece of data was validated. So, after a while, once the project had grown enough, it became quite hard and expensive to keep these validations consistent and in line with the requirements, which, as I’ve said, are often fuzzy.

Is there a path for data validation in an elegant, standard, and concise way? Is there a way that doesn’t fall into unreadability, helps us to keep most of the data validation logic together, and has most of the code already done for us by developers of popular Java frameworks?

Original Link

Who’s Afraid of the Big, Bad Hybrid Cloud?

This article is featured in the new DZone Guide to Cloud: Serverless, Functions, and Multi-Cloud. Get your free copy for more insightful articles, industry statistics, and more!

Cloud management is a key aspect that organizations are looking at on their journey to becoming a software-driven enterprise in order to simplify operations, increase IT efficiency, and reduce data center costs.

Original Link

New Cloud-Native Services and SAP Cloud Platform, ABAP Environment

Great meeting and speaking with Dan Lahl, Vice President of Product Marketing at SAP SE, about their new cloud-native development, deployment, and lifecycle management capabilities in SAP Cloud Platform.

These new features empower enterprises and developers with more business-focused tools to succeed in the digital economy. SAP Cloud Platform offers customers a smooth, fast and easy transition to the cloud and opens up limitless innovation potential.

Original Link

Using Java 11 In Production: Important Things To Know

If you stay up to date on news from the Java community, you may have heard that Oracle changed their support model for Java. Some rumors even suggest that we now have to pay to use Java — this is not true!

This topic is quite a complex one since there are a number of overlapping changes that have come together since the release of Java 8. The new, six-monthly release cadence and Oracle’s changes in licensing and support model mean that any organization that deploys a Java application should take this opportunity to look at:

Original Link

A Comparison of Kubernetes Distributions

Kubernetes is currently one of the most successful and fastest growing IT infrastructure projects. Kubernetes was introduced in 2014 as an open source version of the internal Google orchestrator Borg. 2017 saw an increase in Kubernetes adoption by enterprise and by 2018, it has become widely adopted across diverse businesses, from software developers to airline companies. One of the reasons why Kubernetes gained popularity so fast is its open source architecture and an incredible number of manuals, articles and support provided by its loyal community.

No wonder that just as for any successful open source project, several distributions can be found in the market (think Linux here), offering various extra features and targeting a specific category of users.

Original Link

Java 11 and IntelliJ IDEA

Java 11 was just released! It feels like only yesterday that we were saying the same thing about Java 9. This new, six-monthly release cadence is a big change for the Java community. Java developers are getting small drops of interesting new features regularly, which is exciting.

Java 11

Java 11, like Java 10, has a fairly short list of new features, which is a good thing for us developers, as it’s much easier to see what may be interesting and useful to us. From an IntelliJ IDEA point of view, there’s really only one feature that benefited from some extra support in the IDE, and that was JEP 323: Local-Variable Syntax for Lambda Parameters. We’ve already blogged about this in the context of Java 11 support in IntelliJ IDEA 2018.2, but let’s cover it again quickly.

Original Link

Use Docker Instead of Kubernetes

Today we are all talking about containers and container-based infrastructure. But what is this container technology? And how does it solve today’s problems?

I use containers myself and, of course, I am fascinated by this server technology. Containers can really simplify things. After more than 20 years of building server applications, I have experienced many problems first-hand.

Original Link

Career Planning: Letter to a Young Developer

So, we had a request for advice show up this week, and it struck me as a good topic for our readers:

Young Developer: Hello, Alex, I am currently a senior in high school and want to pursue a career in mobile development after high school.

Original Link

Getting to Know Spinnaker [Webinar]

Spinnaker is an open-source, multi-cloud continuous delivery platform for releasing software with high velocity and confidence. It’s a CI/CD tool that is widely used by developers, and adoption is growing. Watch this webinar to get Spinnaker’s history, use cases, and more for a basic understanding of why Spinnaker is the most powerful CD tool on the market today.



Original Link

Free Java Production Support on Microsoft Azure and Azure Stack

Thanks to Scott Sellers, President and CEO at Azul Systems, for taking me through their collaboration with Microsoft to make fully compatible and compliant commercial builds of Java SE available for Java developers on Microsoft Azure.

Azul Systems will provide fully supported Zulu Enterprise builds of OpenJDK for Azure for all long-term support (LTS) versions of Java, starting with Java SE 7, 8, and 11.

Original Link

PouchContainer RingBuffer Log Practices

PouchContainer is an open-source container technology of Alibaba, which helps enterprises containerize the existing services and enables reliable isolation. PouchContainer is committed to offering new and reliable container technology. Apart from managing service life cycles, PouchContainer is also used to collect logs. This article describes the log data streams of PouchContainer, analyzes the reasons for introducing the non-blocking log buffer, and illustrates the practices of the non-blocking log buffer in Golang.

PouchContainer Log Data Streams

Currently, PouchContainer creates and starts a container using Containerd. The modules involved are shown in the following figure. Without the communication feature of a daemon, a runtime is like a process. To better manage a runtime, the Shim service is introduced between Containerd and Runtime. The Shim service not only manages the life cycle of a runtime but also forwards the standard input/output data of a runtime, namely log data generated by a container.
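The non-blocking log buffer mentioned in the introduction can be illustrated with a small ring buffer. The following is only a minimal Python sketch of the general idea, not PouchContainer's Golang implementation: the class name `RingBufferLog` and its `push` and `drain` methods are hypothetical. The assumption is that a writer should never wait for a slow reader, so when the buffer is full the oldest line is overwritten and counted as dropped.

```python
from collections import deque
from threading import Lock


class RingBufferLog:
    """Fixed-capacity log buffer: writers never wait for a reader (hypothetical sketch)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen discards its oldest item when a new one overflows it.
        self._lines = deque(maxlen=capacity)
        self._lock = Lock()
        self.dropped = 0  # lines overwritten before anyone read them

    def push(self, line):
        """Store a log line; if the buffer is full, the oldest line is overwritten."""
        with self._lock:
            if len(self._lines) == self._lines.maxlen:
                self.dropped += 1
            self._lines.append(line)

    def drain(self):
        """Return all buffered lines in arrival order and empty the buffer."""
        with self._lock:
            lines = list(self._lines)
            self._lines.clear()
            return lines
```

With a capacity of 3, pushing five lines keeps only the last three and records two dropped lines. The trade-off in this sketch is that a slow log consumer loses old data instead of stalling the container's output.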

Original Link

AWS vs. Azure

Since the day of its inception, the cloud computing technology has been revolutionary for its wide appeal and technical prowess. In the cloud computing realm, there are two invincible forces that are becoming a definitive authority on this innovative technical achievement — Microsoft Azure and Amazon Web Services.

Although both Azure and AWS are fundamentally the same in their work mechanisms, there are some notably distinctive aspects. Here we are going to dig deep into those differences.

Original Link

Introducing the Open Hybrid Architecture Initiative

The concept of a modern data architecture has evolved dramatically over the past 10-plus years. Turn the clock back and recall the days of legacy data architectures, which had many constraints. Storage was expensive and had associated hardware costs. Compute often involved appliances and more hardware investments. Networks were expensive, deployments were only on-premises, and proprietary software and hardware were locking in enterprises everywhere you turned.

This was (and for many organizations still is) a world of transactional silos where the architecture only allowed for post-transactional analytics of highly structured data. The weaknesses in these legacy architectures were exposed with the advent of new data types such as mobile and sensors, and new analytics such as machine learning and data science. Couple that with the advent of cloud computing and you have a perfect storm.

Original Link

Why Python 3 Is the Python You Always Wanted

Back when Python was just an idea in Guido van Rossum’s mind, the World Wide Web was just being born, APIs were proprietary (if your application had one at all), and everyone was doing scripting in a brand new language called Perl.

Much has changed since then, including the Python language your dad or mom may have used.

Original Link

Productionizing Data Science With the KNIME Server [Video]

Moving one step further to now put these data science applications into production, a number of requirements need to be taken into account.

Original Link

Top 10 Kafka Features: Reasons Behind the Popularity of Apache Kafka

1. Objective

Today, we will discuss the features of Kafka, such as scalability, reliability, and durability, that show why Kafka is so popular. We will discuss each feature in detail. But before that, let’s understand what Kafka is.

2. What Is Apache Kafka?

Apache Kafka is a distributed publish-subscribe messaging system for handling a high volume of data that enables us to pass messages from one end-point to another. It is suitable for both offline and online message consumption. Moreover, in order to prevent data loss, Kafka messages are persisted on disk and replicated within the cluster. In addition, it is built on top of the ZooKeeper synchronization service. When it comes to real-time streaming data analysis, it can also integrate very well with Apache Storm and Spark. There are many more features of Apache Kafka. Let’s discuss them in detail.

Original Link

Understanding the Kubelet Core Execution Frame

Kubelet is the node agent in a Kubernetes cluster, and is responsible for the Pod lifecycle management on the local node. Kubelet first obtains the Pod configurations assigned to the local node, and then invokes the bottom-layer container runtime, such as Docker or PouchContainer, based on the obtained configurations to create Pods. Then Kubelet monitors the Pods, ensuring that all Pods on the node run in the expected state. This article analyzes this process using the Kubelet source code.

Obtaining Pod Configurations

Kubelet can obtain Pod configurations required by the local node in multiple ways. The most important way is Apiserver. Kubelet can also obtain the Pod configurations by specifying the file directory or accessing the specified HTTP port. Kubelet periodically accesses the directory or HTTP port to obtain Pod configuration updates and adjust the Pod running status on the local node.

Original Link

Mainframe Testing With BlazeMeter

You can now test your mainframe server with Apache JMeter™. The brand new RTE plugin, developed by BlazeMeter, enables emulating actions and keystrokes to IBM mainframes and servers, over protocols TN5250 and TN3270.

Learn how to use the RTE plugin in JMeter from this blog post.

Original Link

Git Strategies for Software Development: Part 1

Git is a version control system for tracking changes in files and coordinating work on those files among multiple people. It is primarily used for source code management in software development. It is a distributed revision control system and is very useful to support software development workflows.

The Git directory on every machine is a full repository that has full version tracking capabilities and is independent of network access. You can maintain branches, perform merges, and continue with development even when you are not connected to the network. For me, having a full repository on my machine and ease of use (creating a branch, merging branches, and maintaining branching workflows) are the biggest advantages of Git. It is free and open-source software distributed under the terms of the GNU General Public License (GPL).

Original Link

Shift Developer Conference 2018 — How to Jump Start a Career in Open Source (Video)

As previously posted, I spent this week at the largest developer conference in Southeast Europe, known as the Shift Developer Conference 2018.

I gave a talk on the soft-skill side of development, suggesting some ways to jump-start a career in open source. I did not mention coding or pull requests, or even suggest joining a coding project. It’s more subtle than the obvious components one would expect in such a topic.

Original Link

Five Can’t-Miss Sessions from DevOps World | Jenkins World 2018

It’s that time of year! DevOps World | Jenkins World is right around the corner and with a new name and an added location – Nice, France – it’s shaping up to be bigger and better than ever. This year’s event in San Francisco, from September 16-19, will host more than 70 sessions covering a variety of DevOps and Jenkins topics from security, pipeline automation and containers to DevOps adoption and continuous delivery best practices.

In addition to keynotes from Kohsuke Kawaguchi, the creator of Jenkins and CTO at CloudBees, Sacha Labourey, CloudBees’ CEO and co-founder and Dr. Nicole Forsgren, CEO and chief scientist of DORA, there will be a number of sessions conducted by industry practitioners.

Original Link

What Is Kubernetes?

If you have been following the latest trends in IT, DevOps and software development, you have probably heard about Kubernetes. Most likely you have seen Kubernetes described as a highly-customizable open source platform for managing containerized workloads and services. This is true, but the terminology could be improved. Let’s dive deeper…

Kubernetes Open Source Project – The Kernel of Kubernetes

When you look at the Kubernetes open source project, you are not going to find source code and installation instructions for a platform as advertised. Instead, you’ll find source code for a pluggable and highly customizable framework that may be used together with other components to build a platform (as advertised). This is the "Kernel" of Kubernetes.

Original Link

Cloud Management: The Good, The Bad, and The Ugly (Part 3): Getting CMPs Right!

See Part 1 and Part 2 of this series.

Cloud management is one of the key areas that CTOs and CIOs need to invest in over the coming years. In the previous two posts in this series, we discussed the challenges with existing cloud management platforms and the five key capabilities of CMPs. In this last part, I’d like to discuss the four key principles around which Platform9 has designed our industry-leading offering in this space, and why we provide an alternative to traditional CMPs that have, unfortunately, given hybrid clouds a bad name.

Original Link

Compendium of Practice Testing Apps — Version 1.2

To help you practice your testing, I have The Evil Tester’s Compendium of Testing Apps. One download = lots of apps to practice testing on.

Version 1.2

Version 1.2 of the Evil Tester’s Compendium of Testing Apps has…

Original Link

How to Install Jenkins on the Apache Tomcat Server

Jenkins is a powerful open source tool that enables you to automate tests and deployment. Apache Tomcat is a powerful servlet Java container for running web applications. If you are running your apps in Tomcat, or wish to do so, you might also want to run Jenkins in it. This blog post will explain how to do it.

If you are looking to install Jenkins in other ways, read how to install Jenkins on Windows, Ubuntu and with a WAR file.

Original Link

How to Easily Do Accessibility Testing in Continuous Integration

How to Add Accessibility Tests to Your CI Pipeline Using the Open-Source Tool, Pa11y

If you are unfamiliar with continuous integration (CI), it’s a practice wherein each developer’s code is merged frequently (at least once per day). In this way, a stable code repository is maintained from which anyone can start working on a change. The build is automated with various automatic checks, such as code quality reviews, execution of unit tests, etc.

Of the tests that can be run in a continuous integration environment, we will be looking at accessibility tests. Accessibility testing, according to Guru99, is a subset of usability testing, and it’s performed to ensure that the application being tested is usable by people with disabilities like the deaf, blind, and color blind as well as other disadvantaged groups.

Original Link

The 2018 DevOps Pulse Survey Results

Who Took This Survey?

This year, over 1,000 IT pros from all over the world took this survey. Over 50% have more than 10 years of experience under their belt.

Key Findings: Security

  • DevOps professionals are taking on security on behalf of their organizations. 54% of respondents shared that their DevOps department handles security incidents in their organizations. Only 41% employ dedicated security operations personnel.
  • Despite this, most DevOps professionals are ill-equipped to handle security. 76% of those surveyed either do not practice DevSecOps or are still in the process of implementation. 71% do not feel their team has adequate knowledge of DevSecOps best practices.
  • The security skills gap is a real concern. Half the organizations surveyed have trouble finding the talent to fill roles on their security analyst teams.
  • Despite being two months into GDPR, a significant number of organizations are not yet ready. 39% are not yet GDPR ready or are still working on it.
  • Organizations are not adopting SIEM tools as quickly as expected. Only 29% of respondents have a SIEM system in place and ELK is the most popular SIEM tool.
  • DDoS is the most feared form of cyber attack. Concern over DDoS is more than double any other type of security incident.

Security Talent and Implementation

54% of organizations use DevOps personnel to handle security incidents. This is closely followed by specialized Security Operations.

Original Link

Using the behave Framework for Selenium BDD Testing: A Tutorial

Let’s say you have a task to automate the testing of an application. Where should you start? The first step is to choose an approach to test automation, which will be the basis for your test development. When you are searching for possible options, you will find out that there are many of them, like unit testing, test-driven development, keyword-driven development, behavior-driven development and so on. In this article, we are going to talk about one of the most popular approaches to test automation – BDD or behavior-driven development. Follow the examples here on GitHub.

Explaining BDD

I suspect you might have a question here: “There is nothing about testing in the technique’s name, so how can it be used for testing?” BDD originates from the test-driven development technique (TDD). This technique requires that tests be created before any functionality is implemented. Usually, TDD is useful for short-term iterations when you need to keep your functionality safe from regression for a single unit that is under development.

But what about integration with other modules? Integration tests are more complex and require more knowledge and time to implement. This is the point at which we need to turn our focus to BDD, where behavior tests are automated instead of module tests.

What are considered “behavior tests”? Behavior tests come from specification and business requirements. Business stakeholders, QA engineers, analysts, application and test developers work together to identify the correct flow and test it. With this approach, every new requirement and functionality can be added so they are covered by tests in the same iteration. Seems promising!

BDD Scenarios in Gherkin

Let’s have a look at BDD in action. In Python, the behave framework is a great implementation of that technique. Scenarios in behave are written using the Gherkin syntax. A simple scenario in Gherkin looks like this:

Feature: User authentication and authorization
  Scenario: The user can log into the system
    Given The user is registered in the system
    When The user enters a login
    And enters a password
    Then the user is logged in

  • Feature keyword — describes which part of the functionality scenarios are being created for.
  • Scenario keyword — is used to describe the test case title.
  • Given keyword — describes pre-conditions required to complete the scenario.
  • When keyword — is used to describe the scenario’s steps.
  • Then keyword — describes the expected result of the scenario.
  • And keyword — can be used for Given, When and Then keywords to describe additional steps.
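To make the role of the keywords concrete, here is a minimal sketch in plain Python (written for Python 3). It is not how behave parses feature files; classify_steps is a hypothetical helper that only illustrates one point from the list above: an And step takes the meaning of the Given, When or Then step that precedes it.

```python
def classify_steps(scenario_text):
    """Return (effective_keyword, step_text) pairs for every step line.

    Feature and Scenario lines are skipped; an 'And' step inherits the
    keyword of the step before it.
    """
    steps = []
    previous = None
    for raw_line in scenario_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("Feature:", "Scenario:")):
            continue
        keyword, _, text = line.partition(" ")
        if keyword == "And":
            if previous is None:
                raise ValueError("'And' needs a preceding Given/When/Then")
            keyword = previous
        elif keyword not in ("Given", "When", "Then"):
            raise ValueError("unknown keyword: " + keyword)
        previous = keyword
        steps.append((keyword, text))
    return steps


SCENARIO = """
Feature: User authorization
  Scenario: The user can log into the system
    Given The user is registered in the system
    When The user enters a login
    And enters a password
    Then the user is logged in
"""

for keyword, text in classify_steps(SCENARIO):
    print(keyword, "->", text)
```

In this sketch the step “And enters a password” is reported as a When step, because it continues the When step before it.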

Our BDD Scenario

Let’s automate a scenario that verifies whether the user can find flights from Paris to London. The scenario will look like this:

Feature: The user can book available flights
  Scenario: The user can find a flight from Paris to London
    Given the user is on the search page
    When the user selects Paris as a departure city
    And the user selects London as a destination city
    And clicks on the Find Flights button
    Then flights are presented on the search result page

To automate the test, we will need:

  1. Python 2.7.14 or above. You can download it from here. There are two major versions of Python nowadays: 2.7.14 and 3.6.4. Code snippets in this blog post are given for version 2.7.14. Where version 3.6.4 differs, a note describes the required changes. It’s up to the reader to choose which version to install.

  2. The Python package manager (pip). It can be downloaded from its download page. All further installations in this blog post use pip, so it’s highly recommended to install it.

  3. A development environment. The PyCharm Community edition will be used in this blog post. You can download it from the Pycharm website. You can use any IDE of your choice since code snippets are not IDE dependent.

Now, let’s create our project.

Create the project in PyCharm IDE with File -> New Project.

Specify the location for the project (the last part of the path will be the project’s name).

When developing a Python application, it’s good practice to isolate its dependencies from others. By doing this, you can be sure that you are using the right version of a library in case there are multiple versions of it on your PYTHONPATH. (The PYTHONPATH variable is used by Python to find any dependency that is declared via an import statement in Python modules.)

To do this, you need to install the virtual environment.

Install the Virtual Environment tool with the command pip install virtualenv in the prompt.

In the project root directory, create the environment with virtualenv BLAZEDEMO in the prompt where BLAZEDEMO is the environment’s name.

You will notice that in the project root you have a new directory created – BLAZEDEMO. This is the folder of your virtual environment. Once the environment is activated, the Python interpreter will be switched to the one available in the virtual environment. Also, all packages and dependencies will be installed within the virtual environment and will not be available outside of it.

To activate it, in POSIX systems run the source bin/activate command from the environment root folder. In Windows systems, go to environment folder -> Scripts and execute activate.bat from the prompt.

If the environment is activated, your prompt will be prefixed with the environment’s name.
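Besides looking at the prompt, you can ask the interpreter itself. The following is a small sketch, not part of the original project; in_virtual_environment is a hypothetical helper name. It relies on sys.base_prefix (set by the standard venv module and by recent virtualenv releases) and on sys.real_prefix (set by older virtualenv releases).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases keep the original installation
    # in sys.base_prefix, so it differs from sys.prefix inside an environment.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
```

Run it once before and once after activation; the reported interpreter path should switch to the one inside the BLAZEDEMO folder.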

Now, we can install all the packages that are required to automate our first BDD scenario.

Install ‘behave’ by executing pip install behave in the prompt.

Since our application is a web application, we will need a tool that helps us interact with the Graphical User Interface. Selenium will make a perfect match here since it enables interacting with any element on a web page just like a user would: clicking on buttons, typing into text fields and so on. Additionally, Selenium has wide support of all popular browsers, and good and full documentation, which make it an easy-to-use tool. To install it, just execute ‘pip install selenium’ in the prompt.

Creating Our Selenium Scenario in behave

In behave, all scenarios are kept in feature files that should be put in the features directory.

Let’s create flight_search.feature and put the scenario we created earlier into it.

All scenario steps must have an implementation. But we need to take care of a few things before:

First, define a browser to run scenarios in. As Selenium only defines the interface to interact with a browser and a few useful tools, the actual implementation is performed by WebDriver. Each browser has its own implementation of WebDriver. Download all the implementations you need from the download page, i.e. for Chrome browser you will need ChromeDriver. Select the appropriate version depending on your operating system.

Waiting for Elements With WebDriverWait

Once WebDriver is downloaded, we need to define the way to access it. For web applications, there is one thing we need to take care of: since elements are not loaded simultaneously, we need to wait until they become available. WebDriver itself doesn’t wait for elements to be accessible; it tries to get them immediately. To handle this, Selenium has a WebDriverWait class that explicitly waits for elements by different conditions. Create a new python directory web and add there.

Here is the script for waiting for elements:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


class Web(object):
    # Maximum time, in seconds, to wait for an element.
    __TIMEOUT = 10

    def __init__(self, web_driver):
        super(Web, self).__init__()  # in python 3.6 you can just call super().__init__()
        self._web_driver_wait = WebDriverWait(web_driver, Web.__TIMEOUT)
        self._web_driver = web_driver

    def open(self, url):
        self._web_driver.get(url)

    def find_by_xpath(self, xpath):
        # Wait until a single element is visible, then return it.
        return self._web_driver_wait.until(EC.visibility_of_element_located((By.XPATH, xpath)))

    def finds_by_xpath(self, xpath):
        # Wait until at least one matching element is present, then return all of them.
        return self._web_driver_wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath)))

In the constructor, the instance variable self._web_driver_wait references an instance of the WebDriverWait class. In the method find_by_xpath we use self._web_driver_wait to wait until the element that matches the XPath becomes visible on a web page. The same goes for the method finds_by_xpath, with the difference that it waits for multiple elements to be present on a web page. The method “open” simply opens a web page by URL.
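If you wonder what an explicit wait does under the hood, the idea can be sketched without Selenium at all. The code below (Python 3) is an illustration only, not Selenium’s implementation: wait_until, WaitTimeout and FakePage are hypothetical names, and the real WebDriverWait additionally ignores selected exceptions while polling.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


class FakePage(object):
    """Stand-in for a page whose element appears only after a few polls."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    def find(self):
        self._remaining -= 1
        return "element" if self._remaining <= 0 else None


page = FakePage(polls_until_ready=3)
print(wait_until(page.find, timeout=2.0, poll_interval=0.01))
```

The caller gets the element as soon as the condition succeeds, and an exception if the timeout expires first, which is the same contract the Web class relies on.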

Browser Initialization

As the final step, we need to take care of when a Web instance should be initialized. We need to remember that with every instance of the web driver there is a new instance of the corresponding web browser. I believe you don’t want to start a new browser every time for every single test; except rare cases when you need to have a clean browser, without cache or any local stored data. This behavior can be achieved in two ways:

  1. With the usage of fixtures. If you are familiar with a testing framework like pytest, you already know what fixtures are. Fixtures are functions, whose main purpose is to execute initialization, configuration and cleanup code.
  2. With environment functions in the file. Within the environment, methods can be executed before and after each test, scenario, feature, or tag, or the whole execution.

To understand how these two approaches work, we need to be familiar with one important behave feature – Context. Think about a context as an object that can store any user-defined data, together with behave-defined data, in context attributes.

Depending on the level of execution (feature, scenario, or test), behave will add specific attributes to the context, such as feature, which stores the currently executed feature, scenario, which stores the currently executed scenario, and so on. Also, you can access the result of a test, test parameters, and the environment. You can browse all the possible behave-based attributes in Context in the behave API documentation.

Browser Initialization With Fixtures

Let’s start with fixture implementation in the file

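from selenium import webdriver
from web.web import Web


def browser_chrome(context, timeout=30, **kwargs):
    # Setup: start Chrome and expose the Web wrapper through the context.
    browser = webdriver.Chrome("C:/chromedriver.exe")
    web = Web(browser)
    context.web = web
    yield context.web
    # Cleanup: runs after the tests that use this fixture are complete.
    browser.quit()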

Here, we have browser_chrome(context, timeout=30, **kwargs) function that will be used as a fixture to start a web browser.

This function starts a webdriver instance that starts a Chrome browser. Then we create the instance of the Web class to access web elements on web pages. Later, we create a new attribute in context – web, that can be referenced further in our steps implementations to access the Web instance. By using yield and browser.quit() in the next line, we are making sure that the browser will be closed after all the tests that use browser_chrome are complete.
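The setup/yield/cleanup pattern is easier to see in isolation. The sketch below is plain Python 3 and does not use behave or Selenium: FakeBrowser, Context and start_fixture are hypothetical stand-ins, and start_fixture only imitates the general idea of running a generator up to its yield and resuming it later for cleanup. It is not behave’s use_fixture implementation.

```python
class FakeBrowser(object):
    """Stand-in for a real browser so the sketch runs without Selenium."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


class Context(object):
    """Bare attribute holder standing in for behave's context object."""


def browser_fixture(context):
    browser = FakeBrowser()   # setup
    context.web = browser
    yield context.web         # value handed to the caller
    browser.quit()            # cleanup, runs when the generator is resumed


def start_fixture(fixture_func, context, cleanups):
    """Run the setup part of a generator fixture and register its cleanup."""
    generator = fixture_func(context)
    value = next(generator)   # executes everything up to the yield

    def cleanup():
        # Resume after the yield; the default avoids StopIteration.
        next(generator, None)

    cleanups.append(cleanup)
    return value


context = Context()
cleanups = []
browser = start_fixture(browser_fixture, context, cleanups)
print("closed before cleanup:", browser.closed)
for cleanup in reversed(cleanups):
    cleanup()
print("closed after cleanup:", browser.closed)
```

Everything before the yield runs when the fixture is started, and everything after it runs only when the registered cleanup is called.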

Let’s see this in action and provide the implementation for the steps in the scenario we defined earlier in the flight_search.feature file. In the features directory of the project create a new python directory, steps, and add there.

from behave import given, when, then
from behave.log_capture import capture


@given("the user is on the search page")
def user_on_search_page(context):
    context.web.open("")


@when("the user selects Paris as a departure city")
def user_select_departure_city(context):
    context.web.find_by_xpath("//select[@name='fromPort']/option[text()='Paris']").click()


@when("the user selects London as a destination city")
def user_select_destination_city(context):
    context.web.find_by_xpath("//select[@name='toPort']/option[text()='London']").click()


@when("clicks on the Find Flights button")
def user_clicks_on_find_flights(context):
    context.web.find_by_xpath("//input[@type='submit']").click()


@then("flights are presented on the search result page")
def flights_are_found(context):
    # The step text in each decorator must match the feature file exactly.
    elements = context.web.finds_by_xpath("//table/tbody/tr")
    assert len(elements) > 1

Every step that is mentioned in the flight_search.feature file has an implementation here. In every step, we reference a web instance through the context. This is a convenient way to access a shared resource without taking care of the way it was initialized, isn’t it?

To apply a fixture, we need to define the behavior for the before_tag option in the features/ file.

from behave import use_fixture
from fixtures import browser_chrome


def before_tag(context, tag):
    if tag == "":
        use_fixture(browser_chrome, context)

As you can see, if a tag equals “” then we execute the browser_chrome fixture by calling the use_fixture method. To try it out, in your project root directory, execute “behave” in the command line. The output should look like this:

There are two main disadvantages to loading a webdriver with this fixtures approach:

  1. Fixtures have a scope that is defined by the scope of the tag “@fixture.*” If we are using @fixture in a scenario, then the web browser will be opened and closed for every scenario with the @fixture tag. The same happens if @fixture is applied to a feature. So, if we have multiple features, the browser will start and close for every feature. This is not good if we don’t want our features to be executed in parallel, but a great option otherwise, by the way.

  2. We need to assign a @fixture tag to every scenario or feature that is supposed to have access to a web page. This is not a good option. Moreover, what if we want to switch to another browser? Do we need to go over all the features and modify @fixture.browser every time?! This issue can be solved by applying the fixture in before_all(context) function from like this:

def before_all(context):
    if context.browser == "chrome":
        use_fixture(browser_chrome, context)

But at this point we are solving the problem with tools that would work on their own, without fixtures. Let’s have a look at how.

Behave gives us the option to define configurations in configuration files that can be called either “.behaverc”, “behave.ini”, “setup.cfg” or “tox.ini” (your preference) and are located in one of three places:

  1. The current working directory (good for per-project settings),
  2. Your home directory ($HOME), or
  3. On Windows, in the %APPDATA% directory.

There are a number of settings that behave uses to set up your environment. But you can also use this file to define your own variables, such as a browser.

Besides the browser, we have stderr_capture and stdout_capture set to False. By default those parameters are set to True, which means that behave will not print any message to the console, or to any other output you specified, unless the test fails. Setting them to False forces behave to print any output even if the test passes. This is a great option if you need to see what is going on in your tests.
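As an illustration of such a file, the sketch below embeds a possible behave.ini as a string and reads it back with Python 3’s configparser. The option names stderr_capture and stdout_capture and the [behave.userdata] section are the ones mentioned in this post; the exact file layout and the value chrome are assumptions made for the example, and behave itself reads the file for you, so the parsing code is only there to show what ends up where.

```python
import configparser

# Assumed content of behave.ini for this example.
BEHAVE_INI = """
[behave]
stderr_capture = False
stdout_capture = False

[behave.userdata]
browser = chrome
"""

parser = configparser.ConfigParser()
parser.read_string(BEHAVE_INI)

# Built-in behave options live in the [behave] section.
capture_stdout = parser.getboolean("behave", "stdout_capture")
capture_stderr = parser.getboolean("behave", "stderr_capture")

# User-defined values live in the [behave.userdata] section.
browser = parser.get("behave.userdata", "browser")

print(capture_stdout, capture_stderr, browser)
```

Changing the browser then means editing one line in the configuration file rather than touching feature files.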

Browser Initialization With Environment Functions

Earlier in the blog post, I mentioned there were two ways to start a webdriver. The first one is to use fixtures. We have already defined how to do it. The second option is to use environment functions. Let’s see how to do it.

We will get access to a Web instance through the before_all environment function in file. To do that, at first create the web_source/ file.

from selenium import webdriver
from web.web import Web


def get_web(browser):
    if browser == "chrome":
        return Web(webdriver.Chrome("C:/chromedriver.exe"))

The code is quite simple and straightforward.

In the file for before_all(context):

from web_source.web_factory import get_web


def before_all(context):
    web = get_web(context.config.userdata['browser'])
    context.web = web

Here, in the code, we are getting the currently set browser from the “browser” variable defined in behave.ini in the [behave.userdata] section.

You can try out this code by executing “behave” in the command line.

This approach is more flexible since we don’t need to modify feature files to switch to another browser.

When to Use Fixtures

But then you can ask: when should I use fixtures? Fixtures are a good option if your initialization depends on the environment you are currently on. For example, if for the development environment you would like to set up a connection to one database, and for the production environment, you would like to set up another connection. This can be achieved in the following way:

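from behave.fixture import use_fixture_by_tag

# develop_database and production_database are fixtures defined elsewhere.
# use_fixture_by_tag looks the tag up as-is, so the keys are the full tag names.
fixture_registry = {
    "environment.develop": develop_database,
    "environment.production": production_database,
}


def before_tag(context, tag):
    if tag.startswith("environment"):
        use_fixture_by_tag(tag, context, fixture_registry)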

For any feature and scenario tagged with @environment.develop or @environment.production, in the before_tag environment function, the appropriate fixture will be loaded and executed as defined in fixture_registry.
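The tag-to-fixture lookup can be pictured as a plain dictionary dispatch. The sketch below runs without behave (Python 3); select_fixture, Context and the two placeholder database functions are hypothetical, and the registry is keyed by the full tag name, which is an assumption of this sketch.

```python
def select_fixture(tag, registry):
    """Return the fixture function registered for a tag."""
    try:
        return registry[tag]
    except KeyError:
        raise LookupError("no fixture registered for tag: " + tag)


def develop_database(context):
    context.database = "develop-db"      # placeholder for a real connection


def production_database(context):
    context.database = "production-db"   # placeholder for a real connection


REGISTRY = {
    "environment.develop": develop_database,
    "environment.production": production_database,
}


class Context(object):
    """Bare attribute holder standing in for behave's context object."""


def before_tag(context, tag):
    # Only tags in the "environment." namespace select a database fixture.
    if tag.startswith("environment."):
        select_fixture(tag, REGISTRY)(context)


context = Context()
before_tag(context, "environment.develop")
print(context.database)
```

A tag outside the environment namespace is simply ignored, while an unknown environment tag fails loudly instead of silently running without a database.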

If you don’t know whether you should be using fixtures or another approach, just ask yourself: will fixtures create more issues than they solve? If the answer is no, then you can use them. Basically, almost everything that is configurable by tags can be managed by fixtures. For example, if you have a slow tag, you can increase your timeout for the tagged test cases and then revert it for the fast ones.

Implementing Parameterized Steps

In the feature from the features/flight_search.feature file we saw how to create a test with static data. But what if we want to search for a flight not only from Paris to London? For such purposes, you can use parameterized steps. Let’s have a look at the implementation:

The feature file will be modified to a new one:

Feature: The user can book available flights
  Scenario: The user can find a flight from Paris to London
    Given the user is on the search page
    When the user selects a departure city "Paris"
    And the user selects a destination city "London"
    And clicks on the Find Flights button
    Then flights are presented on the search result page

And the steps of implementation for choosing cities will be changed:

@when('the user selects a departure city "{city}"')
def user_select_departure_city(context, city):
    # The step text must match the feature file; {city} receives the quoted value.
    context.web.find_by_xpath("//select[@name='fromPort']/option[text()='{}']".format(city)).click()


@when('the user selects a destination city "{city}"')
def user_select_destination_city(context, city):
    context.web.find_by_xpath("//select[@name='toPort']/option[text()='{}']".format(city)).click()

As you can see, the required city is now loaded from the step parameters.
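To see how a placeholder such as {city} can be turned into a function argument, here is a rough sketch using only the re module (Python 3). It is not behave’s matcher; compile_step_pattern and match_step are hypothetical helpers that mimic the idea of matching a step line against a pattern and extracting named values.

```python
import re


def compile_step_pattern(pattern):
    """Turn 'text "{name}"' into a regex with one named group per placeholder."""
    # re.split with a capturing group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)          # literal step text
        else:
            regex += "(?P<%s>.+?)" % part     # placeholder becomes a named group
    return re.compile("^" + regex + "$")


def match_step(pattern, step_text):
    """Return the extracted parameters, or None when the step does not match."""
    match = compile_step_pattern(pattern).match(step_text)
    return match.groupdict() if match else None


PATTERN = 'the user selects a departure city "{city}"'
print(match_step(PATTERN, 'the user selects a departure city "Paris"'))
print(match_step(PATTERN, 'the user is on the search page'))
```

The first call yields the city name as a parameter, while the second step does not match the pattern at all, which is why the wording of a step and its decorator has to agree.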

Execution Commands

So far, we have executed our features using the simple command “behave”. But the behave framework suggests different options for execution. The most useful ones:

  • --include, --exclude – to include or exclude features from the test run.
  • --junit – if you would like to get a JUnit report. You can read more about JUnit on the official site.
  • --logging-level – to specify the level of logging. By default INFO is selected, which means that all messages at INFO level and above will be captured.
  • --logging-format – to define the format in which messages will be printed.
  • --tags – to filter the scenarios that you would like to run. Only scenarios with the specified tag will be executed.

And many others. You can view the full list in the documentation.
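If you start behave from a script, for example in a CI job, it can help to build the command line in one place. The helper below is a hypothetical convenience function, not part of behave; it only assembles an argument list from the options listed above and does not run anything.

```python
def build_behave_command(tags=None, junit=False, logging_level=None, include=None):
    """Assemble a behave command line as a list of arguments."""
    command = ["behave"]
    if tags:
        command += ["--tags", tags]
    if junit:
        command.append("--junit")
    if logging_level:
        command += ["--logging-level", logging_level]
    if include:
        command += ["--include", include]
    return command


print(" ".join(build_behave_command(tags="slow", junit=True, logging_level="DEBUG")))
```

The resulting list can be passed to subprocess.call from the project root once behave is installed in the active environment.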

20 Useful Libraries Java Programmers Should Know

One of the traits of a good and experienced Java developer is the extensive knowledge of API, including JDK and third-party libraries. I spent a good deal of time learning API, especially after reading Effective Java 3rd Edition, where Joshua Bloch advised how to use existing APIs for development rather than writing new pieces of code for common stuff.

That advice made sense to me because of the testing exposure third-party libraries get. In this article, I am going to share some of the most useful and essential libraries and APIs that a Java developer should be familiar with. However, I am not including frameworks, e.g. Spring and Hibernate, because they are pretty well known and have specific features.

In general, I am including useful libraries for day-to-day projects, including logging libraries like Log4j, JSON parsing libraries like Jackson, and unit testing APIs like JUnit and Mockito. If you need to use them in your project, then, you can either include JARs of these libraries in your project’s classpath to start using them or you can use Maven for dependency management.

When you use Maven for dependency management, it will automatically download these libraries, including the libraries they depend on, known as the transitive dependency.

For example, if you download the Spring Framework, it will also download all other JARs on which Spring is dependent, for example, Log4j.

You might not realize it, but having the right version of dependent JARs is a big headache. If you have the wrong version of a JAR, you will get a ClassNotFoundException, NoClassDefFoundError, or UnsupportedClassVersionError.

20 Useful Open Source Libraries for Java Programmers

Here is my collection of useful third-party libraries that Java developers can use in their applications to handle many common tasks. To use these libraries, Java developers first need to know that they exist, and that is the whole point of this article. Once you know a library exists, you can research it and use it.

1. Logging Libraries

Logging libraries are very common, because you need them in every project. They are the most important thing for server-side applications, because logs are the only place where you can see what’s going on in your application. Even though the JDK ships with its own logging library, there are better alternatives available, e.g. Log4j, SLF4j, and LogBack.

A Java developer should be familiar with the pros and cons of the logging library and know why using SLF4j is better than plain Log4j. If you don’t know why, I suggest you read my earlier article on the same subject.

2. JSON Parsing libraries

In today’s world of web services and the IoT, JSON has become the go-to format for carrying information from the client to the server. It has replaced XML as the preferred way to transfer information in a platform-independent way.

Unfortunately, JDK doesn’t have a JSON library. But, there are many good third-party libraries that allow you to both parse and create JSON messages, like Jackson and Gson.

A Java web developer should be familiar with at least one of these libraries. If you want to know more about Jackson and JSON, I suggest going through JSON with the Java API course from Udemy.

3. Unit Testing Libraries

Unit testing is the single most important thing that separates an average developer from a good developer. Programmers often give excuses for not writing unit tests, but the most common one is a lack of experience with and knowledge of popular unit testing libraries, including JUnit, Mockito, and PowerMock.

 I have a goal in 2018 to improve my knowledge of unit testing and integration testing libraries, like JUnit 5, Cucumber, Robot framework, and a few others.

I have also signed up for a JUnit and Mockito Crash Course in Udemy. Even if you know JUnit and the basics of unit testing, you may want to refresh and upgrade your own knowledge.

4. General Purpose Libraries

There are a couple of good, general purpose, third-party libraries available to Java developers, like Apache Commons and Google Guava. I always include these libraries in my projects, because they simplify a lot of tasks.

As Joshua Bloch rightly said in Effective Java, there is no point in reinventing the wheel. We should prefer using tried and tested libraries instead of writing our own routines every now and then.

It’s good for a Java developer to get themselves familiar with Google Guava and the Apache Commons library.

5. HTTP Libraries

One thing I don’t like about the JDK is its lack of support for HTTP. Though you can make an HTTP connection using classes in the java.net package, it’s not as easy or seamless as using open source, third-party libraries like Apache HttpClient and HttpCore.

Though JDK 9 is bringing the support of HTTP 2.0 and better support for HTTP, I strongly suggest all Java developers get familiar with popular HTTP client libraries, including HttpClient and HttpCore.

You can also check out this post What’s New in Java 9 – Modules and More to learn more about JDK 9’s HTTP 2 support.

6. XML Parsing Libraries

There are many XML parsing libraries, including Xerces, JAXB, JAXP, Dom4j, and Xstream. Xerces2 is the next generation of high performance, fully compliant XML parsers in the Apache Xerces family. This new version of Xerces introduces the Xerces Native Interface (XNI), a complete framework for building parser components and configurations that is extremely modular and easy to program.

The Apache Xerces2 parser is the reference implementation of XNI, but other parser components, configurations, and parsers can be written using the Xerces Native Interface. Dom4j is another flexible XML framework for Java applications. If you want to learn more about XML parsing in Java, I suggest you take a look at the Java Web Services and XML online course on Udemy. 

7. Excel Reading Libraries

Believe it or not, all real-world applications have to interact with Microsoft Office in some form or another. Many applications need to provide functionality to export data to Excel, and if you have to do the same from your Java application, you need the Apache POI API.

This is a very rich library that allows you to both read and write XLS files from a Java program. You can see that link for a working example of reading an Excel file in a core Java application.

8. Bytecode Libraries

If you are writing a framework or libraries that generate code or interact with bytecodes, then, you need a bytecode library.

They allow you to read and modify bytecode generated by an application. Some of the popular bytecode libraries in the Java world are javassist and Cglib Nodep.

The Javassist (JAVA programming ASSISTant) makes Java bytecode manipulation very simple. It is a class library for editing bytecodes in Java. ASM is another useful bytecode editing library. If you are not familiar with bytecode, I suggest you check the Introduction to Java Programmers to learn more about it. 

9. Database Connection Pool Libraries

If you are interacting with the database from a Java application but not using database connection pool libraries, then, you are missing something.

Since creating database connections at runtime takes time and makes request processing slower, it’s always advised to use DB connection libraries. Some of the popular ones are Commons Pool and DBCP.

In a web application, the web server generally provides this functionality, but in core Java applications, you need to include these connection pool libraries in your classpath to use a database connection pool.

If you want to learn more about JDBC and the connection pool in a web application, I suggest you take a look at the JSP, Servlet, and JDBC for Beginners course in Udemy.

10. Messaging Libraries 

Similar to logging and database connection, messaging is also a common feature of many real-world Java applications.

Java provides JMS, the Java Message Service, but it is not part of the JDK. To use it, you need to include a separate jms.jar.

Similarly, if you are using third-party messaging protocols, like Tibco RV, then, you need to use a third-party JAR —  tibrv.jar — in your application classpath.

11. PDF Libraries

Similar to Microsoft Excel, PDF is another ubiquitous format. If you need to support PDF functionality in your application, like exporting data to PDF files, you can use the iText and Apache FOP libraries.

Both provide useful PDF related functionality, but iText is richer and better. See here to learn more about iText.

12. Date and Time Libraries

Before Java 8, the JDK’s date and time libraries had many flaws: they were not thread-safe, not immutable, and error-prone. Many Java developers relied on JodaTime to implement their date and time requirements.

From JDK 8 onward, there is no reason to use Joda, because you get all that functionality in JDK 8’s new date and time API. If you are working with an older Java version, though, JodaTime is a library worth learning.

If you want to learn more about the new date and time API, I suggest you check the What’s new in Java 8 course on Udemy. It provides a nice overview of all important features of Java 8, including the date and time API.

13. Collection Libraries

Even though the JDK has a rich collection library, there are some third-party libraries that provide more options, like the Apache Commons collections, Goldman Sachs collections, Google collections, and Trove.

The Trove library is particularly useful because it provides high speed regular and primitive collections for Java.

FastUtil is another similar API. It extends the Java Collections Framework by providing type-specific maps, sets, lists, and priority queues with a small memory footprint, fast access, and insertion; it also provides big (64-bit) arrays, sets, and lists, with fast, practical I/O classes for binary and text files.

14. Email APIs

Both javax.mail and Apache Commons Email provide an API for sending email from Java. Apache Commons Email is built on top of the JavaMail API, which it aims to simplify.

15. HTML Parsing Libraries

Similar to JSON and XML, HTML is another common format many of us have to deal with. Thankfully, we have JSoup, which greatly simplifies working with HTML in a Java application.

You can use JSoup to not only parse HTML but also to create HTML documents.

It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods. JSoup implements the WHATWG HTML5 specification and parses HTML to the same DOM, as modern browsers do.

16. Cryptographic Library

The Apache Commons Codec package contains simple encoder and decoders for various formats, such as Base64 and Hexadecimal.

In addition to these widely used encoders and decoders, the codec package also maintains a collection of phonetic encoding utilities.

17. Embedded SQL Database Library

I really love in-memory databases like H2, which you can embed in your Java application. They are great for testing your SQL scripts and running unit tests that need a database. However, H2 is not the only DB, you also have Apache Derby and HSQL to choose from.

18. JDBC Troubleshooting Libraries

There are some good JDBC extension libraries that can make debugging easier, like P6Spy.

This is a library that enables database data to be seamlessly intercepted and logged with no code changes to the application. You can use these to log SQL queries and their timings.

For example, if you are using PreparedStatement and CallableStatement in your code, these libraries can log the exact call with its parameters and how much time it took to execute.

If you want to learn more about JDBC, you can check out the JDBC for Beginners

19. Serialization Libraries

Google Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. They are a richer and better alternative to Java serialization. I strongly recommend that experienced Java developers learn Google Protobuf. You can see this article to learn more about the Google Protocol Buffer.

20. Networking Libraries

Some of the useful networking libraries are Netty and Apache MINA. If you are writing an application where you need to do low-level networking tasks, consider using these libraries. If you want to learn more about network programming in Java, check out the Java Network Programming – TCP/IP Socket Programming.

That’s all for now about some of the useful libraries every Java developer should be using. The Java sphere is vast, and you will find tons of libraries for doing different things.

If you want to do anything in Java, more than likely, you will find a library on how to do just that. As always, Google is your best friend to find useful Java libraries, but you can also take a look at the Maven central repository to find some of the useful libraries for your task at hand.

Development of a New Static Analyzer: PVS-Studio Java

The PVS-Studio static analyzer is known in the C, C++, and C# worlds as a tool for detecting errors and potential vulnerabilities. However, we have few clients from the financial sector, because it turned out that Java and IBM RPG are what is in high demand there. We would like to be closer to the enterprise world, so after some consideration, we decided to start creating a Java analyzer.


Sure, we had some concerns. It is quite simple to carve out a niche of IBM RPG analyzers. I am not even sure that there are decent tools for static analysis for this language. In the Java world, things are completely different. There is already a range of tools for static analysis, and to get ahead you need to create a really powerful tool.

Nevertheless, our company had experience with using several tools for static analysis of Java. We are convinced that many things can be implemented better.

In addition, we had an idea of how to tap the full power of our C++ analyzer into the Java analyzer. But, first things first.


First, it was necessary to decide how we would get a syntax tree and semantic model.

The syntax tree is the base element around which the analyzer is built. When running checks, the analyzer traverses the tree and reviews its separate nodes. It is practically impossible to perform serious static analysis without such a tree. For example, a search for bugs using regular expressions is futile.

It also should be noted that the syntax tree alone is not enough. The analyzer requires semantic information as well. For example, we need to know the types of all elements in the tree, be able to jump to a declaration of a variable, etc.

We reviewed several options for obtaining the syntax tree and semantic model.

We gave up on the idea of using ANTLR almost at once, since it would unreasonably complicate the development of the analyzer (we would have had to implement semantic analysis ourselves). Eventually, we decided to settle on the Spoon library:

  • It is not just a parser but a whole ecosystem that provides both a parse tree and facilities for semantic analysis. For example, it lets you get the types of variables, jump to a variable declaration, get information about a parent class, and so on.
  • It is based on Eclipse JDT and can compile the code.
  • It supports the latest Java version and is constantly updated.
  • It has decent documentation and an intuitive API.

Here is an example of a metamodel that Spoon provides that we use when creating diagnostic rules:

Picture 10

This metamodel corresponds to the following code:

class TestClass
{
  void test(int a, int b)
  {
    int x = (a + b) * 4;
    System.out.println(x);
  }
}

One of the nice features of Spoon is that it simplifies the syntax tree by removing and adding nodes to make it easier to work with it. With this, the semantic equivalence of a simplified metamodel to a source metamodel is guaranteed.

For us, this means, for example, that we do not need to care about skipping redundant parentheses when traversing the tree. In addition, each statement is placed in a block, imports are expanded, and some other similar simplifications are performed.

For example, the following code:

for (int i = ((0)); (i < 10); i++)
  if (cond)
    return (((42)));

will be represented as follows:

for (int i = 0; i < 10; i++)
{
  if (cond)
  {
    return 42;
  }
}

Based on the syntax tree, a pattern-based analysis is performed. This is a search for errors in the source code of a program by known code patterns containing an error. In the simplest case, the analyzer performs a search in the tree for places, similar to an error, according to the rules described in the appropriate diagnostic. The number of such patterns is large and their complexity can vary greatly.

The simplest example of an error detectable by pattern-based analysis is the following code from the jMonkeyEngine project:

if (p.isConnected()) {
  log.log(Level.FINE, "Connection closed:{0}.", p);
}
else {
  log.log(Level.FINE, "Connection closed:{0}.", p);
}

When the then and else blocks of an if statement fully coincide, there is most likely a logic error.
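To make the idea of a pattern-based check more concrete, here is a minimal sketch of the "identical branches" pattern on a toy syntax tree. The Node, Call, Block, and If types are hypothetical and are not part of Spoon's metamodel; they only illustrate how a rule can walk a tree and compare two subtrees structurally.

```java
import java.util.ArrayList;
import java.util.List;

// Toy syntax tree; these types are hypothetical and unrelated to Spoon's metamodel.
class IdenticalBranchesSketch {
    interface Node {}
    record Call(String text) implements Node {}
    record Block(List<Node> statements) implements Node {}
    record If(String condition, Node thenBranch, Node elseBranch) implements Node {}

    // Walks the tree and reports every `if` whose branches are structurally equal.
    static List<String> check(Node node) {
        List<String> warnings = new ArrayList<>();
        visit(node, warnings);
        return warnings;
    }

    private static void visit(Node node, List<String> warnings) {
        if (node instanceof If) {
            If ifNode = (If) node;
            // Records compare by value, so equals() is a structural comparison here.
            if (ifNode.elseBranch() != null
                    && ifNode.thenBranch().equals(ifNode.elseBranch())) {
                warnings.add("identical branches in if (" + ifNode.condition() + ")");
            }
            visit(ifNode.thenBranch(), warnings);
            if (ifNode.elseBranch() != null) {
                visit(ifNode.elseBranch(), warnings);
            }
        } else if (node instanceof Block) {
            for (Node child : ((Block) node).statements()) {
                visit(child, warnings);
            }
        }
    }

    public static void main(String[] args) {
        Node log = new Call("log.log(Level.FINE, \"Connection closed:{0}.\", p)");
        Node tree = new If("p.isConnected()",
                new Block(List.of(log)), new Block(List.of(log)));
        System.out.println(check(tree));
    }
}
```

A real rule has to work on a normalized tree (which is what the simplified metamodel provides), otherwise redundant parentheses or missing braces would hide equal branches from a structural comparison like this one.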

Here is another similar example from the Hive project:

if (obj instanceof Number) {
  // widening conversion
  return ((Number) obj).doubleValue();
} else if (obj instanceof HiveDecimal) {     // <=
  return ((HiveDecimal) obj).doubleValue();
} else if (obj instanceof String) {
  return Double.valueOf(obj.toString());
} else if (obj instanceof Timestamp) {
  return new TimestampWritable((Timestamp)obj).getDouble();
} else if (obj instanceof HiveDecimal) {     // <=
  return ((HiveDecimal) obj).doubleValue();
} else if (obj instanceof BigDecimal) {
  return ((BigDecimal) obj).doubleValue();
}

In this code, there are two identical conditions in a sequence of type if (….) else if (….) else if (….). This code fragment is worth checking for a logical error, or the duplicated code should be removed.

Data-flow Analysis

In addition to the syntax tree and semantic model, the analyzer requires a mechanism for data flow analysis.

Data flow analysis enables you to calculate the possible values of variables and expressions in each point of the program and, thanks to that, find errors. We call these possible values ‘virtual values’.

Virtual values are created for variables, class fields, method parameters, and other things at first reference. If the first reference is an assignment, the data flow mechanism computes the virtual value by analyzing the expression on the right. Otherwise, the whole valid range of values for the variable's type is taken as the virtual value. For example:

void func(byte x)  // x: [-128..127]
{
  int y = 5;       // y: [5]
  ...
}

At each change of a variable value, the data flow mechanism recalculates the virtual value. For example:

void func()
{
  int x = 5;  // x: [5]
  x += 7;     // x: [12]
  ...
}

The data flow mechanism also handles control statements:

void func(int x)  // x: [-2147483648..2147483647]
{
  if (x > 3)
  {
    // x: [4..2147483647]
    if (x < 10)
    {
      // x: [4..9]
    }
  }
  else
  {
    // x: [-2147483648..3]
  }
  ...
}

In this example, on entry to the function there is no information about the range of values of the variable x, so the range is set according to the variable's type (from -2147483648 to 2147483647). Then the first conditional block imposes the restriction x > 3, and the range is narrowed accordingly. As a result, the range of values for x in the then block is from 4 to 2147483647, and in the else block it is from -2147483648 to 3. The second condition, x < 10, is handled similarly.
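The narrowing of ranges described above can be sketched with a small interval type. This is only an illustration under simplifying assumptions: the IntRange class and its method names are hypothetical, and a real virtual value also has to track sets of ranges, unknown values, and overflow.

```java
// Hypothetical interval type; the analyzer's real "virtual values" are richer than this.
final class IntRange {
    final long min;
    final long max;

    IntRange(long min, long max) {
        this.min = min;
        this.max = max;
    }

    // Full range of a Java int, used when nothing is known about the variable.
    static IntRange ofInt() {
        return new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);
    }

    // Range that remains when the condition `x > bound` holds.
    IntRange whenGreaterThan(long bound) {
        return new IntRange(Math.max(min, bound + 1), max);
    }

    // Range that remains when `x > bound` does not hold, i.e. x <= bound.
    IntRange whenNotGreaterThan(long bound) {
        return new IntRange(min, Math.min(max, bound));
    }

    // Range that remains when the condition `x < bound` holds.
    IntRange whenLessThan(long bound) {
        return new IntRange(min, Math.min(max, bound - 1));
    }

    // An empty range means the condition that produced it can never be true.
    boolean isEmpty() {
        return min > max;
    }

    @Override
    public String toString() {
        return isEmpty() ? "[]" : "[" + min + ".." + max + "]";
    }

    public static void main(String[] args) {
        IntRange x = IntRange.ofInt();
        IntRange thenBranch = x.whenGreaterThan(3);
        System.out.println(thenBranch);                   // [4..2147483647]
        System.out.println(thenBranch.whenLessThan(10));  // [4..9]
        System.out.println(x.whenNotGreaterThan(3));      // [-2147483648..3]
    }
}
```

An empty range is what turns this bookkeeping into a diagnostic: if a condition leaves no possible values, the condition is always false.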

In addition, the analyzer must be able to perform purely symbolic computations. The simplest example:

void f1(int a, int b, int c)
{
  a = c;
  b = c;
  if (a == b)  // <= always true
    ....
}

Here, the variable a is assigned the value c, the variable b is also assigned the value c, and then a and b are compared. In this case, to find the error, it is enough to remember the fragment of the tree that corresponds to the right side of each assignment.

Here is a slightly more complicated example with symbolic computations:

void f2(int a, int b, int c)
{
  if (a < b)
  {
    if (b < c)
    {
      if (c < a)  // <= always false
        ....
    }
  }
}

In such cases, we have to solve a system of inequalities in symbolic form.
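One simple way to illustrate this kind of symbolic reasoning is to store the known "less than" facts as a graph and check a new condition by transitivity. The sketch below is not how PVS-Studio solves such systems (the article does not say); the LessThanFacts class and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical fact store for strict inequalities between symbolic variables.
class LessThanFacts {
    // greater.get(x) holds every y for which the fact `x < y` was recorded.
    private final Map<String, Set<String>> greater = new HashMap<>();

    // Records the fact `smaller < larger`.
    void assume(String smaller, String larger) {
        greater.computeIfAbsent(smaller, k -> new HashSet<>()).add(larger);
    }

    // True if `from < to` follows from the recorded facts by transitivity.
    boolean implies(String from, String to) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : greater.getOrDefault(current, Set.of())) {
                if (next.equals(to)) {
                    return true;
                }
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return false;
    }

    // `left < right` is always false if the facts already imply `right < left`.
    boolean alwaysFalse(String left, String right) {
        return left.equals(right) || implies(right, left);
    }

    public static void main(String[] args) {
        LessThanFacts facts = new LessThanFacts();
        facts.assume("a", "b");   // inside if (a < b)
        facts.assume("b", "c");   // inside if (b < c)
        System.out.println(facts.alwaysFalse("c", "a"));   // true
    }
}
```

The facts a < b and b < c imply a < c, so the condition c < a can never hold inside the two outer checks.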

The data flow mechanism helps the analyzer to find errors that are quite difficult to detect using a pattern-based analysis.

Such errors include:

  • Overflows;
  • Array index out of bounds;
  • Access by null or potentially null reference;
  • Pointless conditions (always true/false);
  • Memory and resource leaks;
  • Division by zero;
  • And some others.

Data flow analysis is especially important when searching for vulnerabilities. For example, if a program receives input from a user, there is a chance that the input will be used to cause a denial of service or to gain control over the system. Examples include errors leading to buffer overflows on some inputs, or SQL injections. In both cases, you need to track the data flow and the possible values of variables so that the static analyzer can detect such errors and vulnerabilities.

The data flow mechanism is complex and extensive; in this article, I have only touched on its basics.

Let’s see some examples of errors that can be detected using the data flow mechanism.

Here is one from the Hive project:

public static boolean equal(byte[] arg1, final int start1, final int len1,
                            byte[] arg2, final int start2, final int len2) {
  if (len1 != len2) {   // <=
    return false;
  }
  if (len1 == 0) {
    return true;
  }
  ....
  if (len1 == len2) {   // <=
    ....
  }
}

The condition len1 == len2 is always true, because the opposite check has already been performed above.

Another example from the same project:

if (instances != null) {   // <=
  Set<String> oldKeys = new HashSet<>(instances.keySet());
  if (oldKeys.removeAll(latestKeys)) {
    ....
  }
  this.instances.keySet().removeAll(oldKeys);
  this.instances.putAll(freshInstances);
} else {
  this.instances.putAll(freshInstances);   // <=
}

Here, a null pointer dereference occurs in the else block. Note: instances is the same thing as this.instances.

Here is an example from the jMonkeyEngine project:

public static int convertNewtKey(short key) {
  ....
  if (key >= 0x10000) {
    return key - 0x10000;
  }
  return 0;
}

Here, the variable key is compared with the number 65536. However, it is of the type short, and the maximum possible value for short is 32767. Accordingly, the condition is never true.

Here is one more example from the Jenkins project:

public final R getSomeBuildWithWorkspace() {
  int cnt = 0;
  for (R b = getLastBuild(); cnt < 5 && b != null; b = b.getPreviousBuild()) {
    FilePath ws = b.getWorkspace();
    if (ws != null) return b;
  }
  return null;
}

In this code, the variable cnt was introduced to limit the number of steps to five, but a developer forgot to increment it, which resulted in a useless check.

Annotations Mechanism

In addition, the analyzer needs a mechanism of annotations. Annotations are a markup system that provides the analyzer with extra information on the used methods and classes, in addition to the data that can be obtained by the analysis of their signatures. Markup is done manually; this is a long and time-consuming process, because, to achieve the best results, one has to annotate a large number of standard Java classes and methods. It also makes sense to perform the annotation of popular libraries. Overall, annotations can be regarded as a knowledge base of the analyzer about contracts of standard methods and classes.

Here’s a small sample of an error that can be detected using annotations:

int test(int a, int b) {
  ...
  return Math.max(a, a);
}

In this example, because of a typo, the variable passed as the first argument of Math.max is also passed as the second argument. Such an expression is meaningless and suspicious.

The static analyzer can issue a warning for such code because it knows that the arguments of Math.max should always be different.
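As a rough illustration of how such knowledge can drive a check, the sketch below keeps a tiny table of contracts keyed by method name and tests a call against it. The ContractSketch class, the table layout, and the use of argument text for comparison are all assumptions made for this example; the analyzer's actual annotations are described next.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical, much-simplified contract table keyed by fully qualified method name.
class ContractSketch {
    // A contract here is just a predicate over the textual form of the call's arguments.
    static final Map<String, Predicate<List<String>>> REQUIRES = Map.of(
        "java.lang.Math.max", arguments -> !arguments.get(0).equals(arguments.get(1)));

    // Returns true when the call violates a known contract and deserves a warning.
    static boolean isSuspicious(String method, List<String> arguments) {
        Predicate<List<String>> contract = REQUIRES.get(method);
        return contract != null && !contract.test(arguments);
    }

    public static void main(String[] args) {
        System.out.println(isSuspicious("java.lang.Math.max", List.of("a", "a")));  // true
        System.out.println(isSuspicious("java.lang.Math.max", List.of("a", "b")));  // false
    }
}
```

Keeping contracts as data rather than as code inside each rule means that one generic diagnostic can cover every annotated method.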

Going forward, here are a few examples of our markup of built-in classes and methods:

Class("java.lang.Math")
  - Function("abs", Type::Int32)
      .Pure()
      .Set(FunctionClassification::NoDiscard)
      .Returns(Arg1, [](const Int &v) { return v.Abs(); })

  - Function("max", Type::Int32, Type::Int32)
      .Pure()
      .Set(FunctionClassification::NoDiscard)
      .Requires(NotEquals(Arg1, Arg2))
      .Returns(Arg1, Arg2, [](const Int &v1, const Int &v2)
          { return v1.Max(v2); })

Class("java.lang.String", TypeClassification::String)
  - Function("split", Type::Pointer)
      .Pure()
      .Set(FunctionClassification::NoDiscard)
      .Requires(NotNull(Arg1))
      .Returns(Ptr(NotNullPointer))

Class("java.lang.Object")
  - Function("equals", Type::Pointer)
      .Pure()
      .Set(FunctionClassification::NoDiscard)
      .Requires(NotEquals(This, Arg1))

Class("java.lang.System")
  - Function("exit", Type::Int32)
      .Set(FunctionClassification::NoReturn)


  • Class is a class being annotated;
  • Function is a method of the annotated class;
  • Pure is the annotation, indicating that a method is pure, i.e. deterministic and does not have side effects;
  • Set sets an arbitrary flag for the method;
  • FunctionClassification::NoDiscard is a flag indicating that the return value of the method must be used;
  • FunctionClassification::NoReturn is a flag that indicates that the method does not return control;
  • Arg1, Arg2, ..., ArgN are the method arguments;
  • Returns is the return value of the method;
  • Requires is a contract for a method.

It is worth noting that, in addition to manual markup, there is another approach to annotating: automatic inference of contracts from bytecode. Such an approach can obtain only certain types of contracts, but it provides additional information about all dependencies, not just those that were annotated manually.

By the way, there is already a tool, FABA, that can infer contracts such as @Nullable and @NotNull from bytecode. As far as I understand, a derivative of FABA is used in IntelliJ IDEA.

At the moment, we are also considering the ability to add the bytecode analysis for obtaining contracts of all methods, as these contracts could well complement our manual annotations.

Diagnostic rules often refer to the annotations. In addition to diagnostics, annotations are used by the data flow mechanism. For example, using the annotation for the method java.lang.Math.abs, it can accurately calculate the absolute value of a number. We don't have to write any additional code for that; we only need to annotate the method correctly.

Let’s consider the example of an error that can be found due to annotations from the Hibernate project:

public boolean equals(Object other) {
  if (other instanceof Id) {
    Id that = (Id) other;
    return purchaseSequence.equals(this.purchaseSequence) &&
           that.purchaseNumber == this.purchaseNumber;
  }
  else {
    return false;
  }
}

In this code, the method equals() compares the object purchaseSequence with itself. Most likely, this is a typo, and that.purchaseSequence, not purchaseSequence, should be written on the right.

How Dr. Frankenstein Assembled the Analyzer From Pieces


Since the data flow and annotation mechanisms are not strongly tied to a specific language, we decided to reuse them from our C++ analyzer. This let us bring the whole power of the C++ analyzer core to the Java analyzer in a short time. The decision was also influenced by the fact that these mechanisms are written in modern C++ with a lot of metaprogramming and template magic, so they are not well suited for porting to another language.

To connect the Java part with the C++ kernel, we decided to use SWIG (Simplified Wrapper and Interface Generator), a tool that automatically generates wrappers and interfaces for binding C and C++ programs with programs written in other languages. For Java, SWIG generates JNI (Java Native Interface) code.

SWIG is great for cases when there is already a large amount of C++ code that needs to be integrated into a Java project.

Let me give you a small example of working with SWIG. Let’s suppose we have a C++ class that we want to use in a Java project:


class CoolClass
{
public:
  int val;
  CoolClass(int val);
  void printMe();
};


#include <iostream>
#include "CoolClass.h"

CoolClass::CoolClass(int v) : val(v) {}

void CoolClass::printMe()
{
  std::cout << "val: " << val << '\n';
}

First, you must create a SWIG interface file with a description of all the exported functions and classes. If necessary, you will also need to perform additional settings in this file.


%module MyModule
%{
#include "CoolClass.h"
%}
%include "CoolClass.h"

After that, you can run SWIG:

$ swig -c++ -java Example.i

It will generate the following files:

  • a Java proxy class for CoolClass, which we will work with directly in a Java project;
  • a module class that contains all free functions and variables;
  • Java wrappers (the JNI class);
  • Example_wrap.cxx – C++ wrappers.

Now, you just need to add the resulting .java files to the Java project and the .cxx file to the C++ project.

Finally, you need to compile the C++ project as a DLL and load it in the Java project using System.loadLibrary():

class App {
  static {
    System.loadLibrary("example");
  }
  public static void main(String[] args) {
    CoolClass obj = new CoolClass(42);
    obj.printMe();
  }
}

Schematically, this can be represented as follows:

Picture 8

Of course, in a real project things are not that simple, and you have to put in more effort:

  • In order to use template classes and methods from C++, you must instantiate them for all template parameters by using the directive %template;
  • In some cases, you may need to catch exceptions that are thrown from the C++ part in the Java part. By default, SWIG doesn’t catch exceptions from C++ (segfault occurs). However, it is possible to do this using the directive %exception;
  • SWIG allows extending the C++ code on the Java side using the directive %extend. In our project, for example, we add the method toString() to virtual values to see them in the Java debugger;
  • In order to emulate the RAII behavior from C++, the interface AutoCloseable is implemented;
  • The directors mechanism allows cross-language polymorphism;
  • For types that are allocated only inside C++ (in its own memory pool), constructors and finalizers are removed to improve performance. The garbage collector will ignore these types.
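The AutoCloseable point from the list above can be sketched as follows. NativeHandle is a hypothetical stand-in for a generated proxy class, written here without any native code; the assumption is that a real proxy would release its C++ object at the place marked in close().

```java
// Hypothetical stand-in for a SWIG proxy; a real proxy would free C++ memory in close().
class NativeHandle implements AutoCloseable {
    private boolean released = false;

    // Any operation on a released handle is a bug, so fail loudly.
    void use() {
        if (released) {
            throw new IllegalStateException("handle already released");
        }
    }

    boolean isReleased() {
        return released;
    }

    // close() plays the role of the C++ destructor; calling it twice is harmless.
    @Override
    public void close() {
        if (!released) {
            released = true;   // a generated proxy would release its native object here
        }
    }

    public static void main(String[] args) {
        NativeHandle handle = new NativeHandle();
        try (NativeHandle h = handle) {
            h.use();
        }   // close() runs here, even if use() throws
        System.out.println(handle.isReleased());   // true
    }
}
```

With try-with-resources, the native object is released deterministically at the end of the block instead of whenever the garbage collector gets to it, which is the closest Java equivalent of RAII.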

You can learn more about all of these mechanisms in the SWIG documentation.

Our analyzer is built using Gradle, which calls CMake, which in turn calls SWIG and builds the C++ part. For programmers, this happens almost imperceptibly, so we experience no particular inconvenience when developing.

The core of our C++ analyzer is built under Windows, Linux, and macOS, so the Java analyzer also works on these operating systems.

What Is a Diagnostic Rule?

We write the diagnostics themselves and the analysis code in Java, because they interact closely with Spoon. Each diagnostic rule is a visitor with overridden methods in which the elements of interest to us are traversed:

Picture 9

For example, this is what a V6004 diagnostic frame looks like:

class V6004 extends PvsStudioRule
{
  ....
  @Override
  public void visitCtIf(CtIf ifElement)
  {
    // if ifElement.thenStatement statement is equivalent to
    // ifElement.elseStatement statement => add warning V6004
  }
}


To make it easy to integrate the static analyzer into a project, we have developed plugins for the Maven and Gradle build systems. A user just needs to add our plugin to the project.

For Gradle:

apply plugin: com.pvsstudio.PvsStudioGradlePlugin
pvsstudio {
  outputFile = 'path/to/output.json'
  ....
}

For Maven:

<plugin>
  <groupId>com.pvsstudio</groupId>
  <artifactId>pvsstudio-maven-plugin</artifactId>
  <version>0.1</version>
  <configuration>
    <analyzer>
      <outputFile>path/to/output.json</outputFile>
      ....
    </analyzer>
  </configuration>
</plugin>

After that, the plugin will receive the project structure and start the analysis.

In addition, we have developed a plugin prototype for IntelliJ IDEA.

Picture 1

This plugin also works in Android Studio. The plugin for Eclipse is now under development.

Incremental Analysis

We have provided the incremental analysis mode that allows checking only modified files, which significantly reduces the time for analysis. Thanks to that, developers will be able to run the analysis as often as necessary.

The incremental analysis involves several stages:

  • Caching of the Spoon metamodel;
  • Rebuilding of the modified part of the metamodel;
  • Analysis of the changed files.
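The last two stages depend on knowing which files changed since the previous run. A minimal sketch of such change detection is shown below; the IncrementalSketch class is hypothetical, and it caches whole file contents only to stay short, where a real tool would more likely keep a hash or a timestamp alongside the cached metamodel.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical change detector: remembers what each file looked like on the previous run.
class IncrementalSketch {
    // Path -> content seen on the previous run (a real tool would store a hash instead).
    private final Map<String, String> previousContents = new HashMap<>();

    // Returns the paths whose content changed since the previous call and updates the cache.
    Set<String> changedFiles(Map<String, String> sources) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> entry : sources.entrySet()) {
            String previous = previousContents.put(entry.getKey(), entry.getValue());
            if (!entry.getValue().equals(previous)) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        IncrementalSketch analysis = new IncrementalSketch();
        Map<String, String> sources = new HashMap<>();
        sources.put("A.java", "class A {}");
        sources.put("B.java", "class B {}");
        System.out.println(analysis.changedFiles(sources));   // first run: every file is new
        sources.put("B.java", "class B { int x; }");
        System.out.println(analysis.changedFiles(sources));   // second run: only the edited file
    }
}
```

Only the files returned by changedFiles would need their part of the metamodel rebuilt and re-analyzed.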

Our Testing System

To test the Java analyzer on real projects, we wrote a special tool that works with a database of open source projects. It is written in Python + Tkinter and is cross-platform.

It works in the following way:

  • The tested project of a certain version is loaded from a repository on GitHub;
  • The project is built;
  • Our plugin is added to pom.xml or build.gradle (using git apply);
  • Static analyzer is started using the plugin;
  • The resulting report is compared with the reference report (the "etalon") for this project.

Such an approach ensures that good warnings will not disappear because of changes in the analyzer code. The following illustration shows the interface of our utility for testing.

Picture 11

Projects whose reports differ from the etalon (the reference report) are highlighted in red. The Approve button saves the current version of the report as the new etalon.
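The comparison against the reference report boils down to a set difference in both directions. The sketch below is written in Java for consistency with the rest of the article (the actual tool is written in Python), and the ReportDiff class and the string format of the warnings are assumptions.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical report comparison; warnings are plain strings for the sake of the sketch.
class ReportDiff {
    // Warnings present in the reference report but missing from the current one.
    static Set<String> missing(Set<String> reference, Set<String> current) {
        Set<String> result = new TreeSet<>(reference);
        result.removeAll(current);
        return result;
    }

    // Warnings that appear in the current report but not in the reference one.
    static Set<String> added(Set<String> reference, Set<String> current) {
        return missing(current, reference);
    }

    public static void main(String[] args) {
        Set<String> reference = Set.of("V6007 A.java:10", "V6001 B.java:3");
        Set<String> current = Set.of("V6007 A.java:10", "V6004 C.java:7");
        System.out.println("missing: " + missing(reference, current));
        System.out.println("added:   " + added(reference, current));
    }
}
```

A non-empty "missing" set is the signal that a change in the analyzer made a previously reported warning disappear.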

Examples of Errors

Below, I will demonstrate several errors from different open source projects that our Java analyzer has detected. In the future, we plan to write articles with a more detailed report on each project.

Hibernate Project

PVS-Studio warning: V6009 Function ‘equals’ receives odd arguments. Inspect arguments: this, 1. 57

public boolean equals(Object other) {
  if (other instanceof Id) {
    Id that = (Id) other;
    return purchaseSequence.equals(this.purchaseSequence) &&
           that.purchaseNumber == this.purchaseNumber;
  }
  else {
    return false;
  }
}

In this code, the method equals() compares the object purchaseSequence with itself. Most likely, this is a typo, and that.purchaseSequence, not purchaseSequence, should be written on the right.

PVS-Studio warning: V6009 Function ‘equals’ receives odd arguments. Inspect arguments: this, 1. 232

public void removeBook(String title) {
  for( Iterator<Book> it = books.iterator(); it.hasNext(); ) {
    Book book = it.next();
    if ( title.equals( title ) ) {
      it.remove();
    }
  }
}

This warning is similar to the previous one: book.title, not title, should be on the right.

Hive Project

PVS-Studio warning: V6007 Expression ‘colOrScalar1.equals(“Column”)’ is always false. 2768

PVS-Studio warning: V6007 Expression ‘colOrScalar1.equals(“Scalar”)’ is always false. 2774

PVS-Studio warning: V6007 Expression ‘colOrScalar1.equals(“Column”)’ is always false. 2785

String colOrScalar1 = tdesc[4];
if (colOrScalar1.equals("Col") &&
    colOrScalar1.equals("Column")) {
  ....
} else if (colOrScalar1.equals("Col") &&
           colOrScalar1.equals("Scalar")) {
  ....
} else if (colOrScalar1.equals("Scalar") &&
           colOrScalar1.equals("Column")) {
  ....
}

Here, the operators were obviously confused: ‘&&’ was used instead of ‘||’.

JavaParser Project

PVS-Studio warning: V6001 There are identical sub-expressions ‘tokenRange.getBegin().getRange().isPresent()’ to the left and to the right of the ‘&&’ operator. 213

public Node setTokenRange(TokenRange tokenRange)
{
  this.tokenRange = tokenRange;
  if (tokenRange == null ||
      !(tokenRange.getBegin().getRange().isPresent() &&
        tokenRange.getBegin().getRange().isPresent()))
  {
    range = null;
  }
  else
  {
    range = new Range(
      tokenRange.getBegin().getRange().get().begin,
      tokenRange.getEnd().getRange().get().end);
  }
  return this;
}

The analyzer has detected that there are identical expressions on the left and the right of the && operator. Besides that, all methods in the chain are pure. Most likely, in the second case, tokenRange.getEnd() has to be used rather than tokenRange.getBegin().

PVS-Studio warning: V6016 Suspicious access to element of ‘typeDeclaration.getTypeParameters()’ object by a constant index inside a loop. 265

if (!isRawType()) {
  for (int i=0; i<typeDeclaration.getTypeParams().size(); i++) {
    typeParametersMap.add(
      new Pair<>(typeDeclaration.getTypeParams().get(0),
                 typeParametersValues().get(i)));
  }
}

The analyzer has detected a suspicious access to the element of a collection by constant index inside the loop. Perhaps, there is an error in this code.

Jenkins Project

PVS-Studio warning: V6007 Expression ‘cnt < 5’ is always true. 557

public final R getSomeBuildWithWorkspace() {
  int cnt = 0;
  for (R b = getLastBuild(); cnt < 5 && b != null; b = b.getPreviousBuild()) {
    FilePath ws = b.getWorkspace();
    if (ws != null) return b;
  }
  return null;
}

In this code, the variable cnt was introduced to limit the number of iterations to five, but a developer forgot to increment it, which resulted in a useless check.

Spark Project

PVS-Studio warning: V6007 Expression ‘sparkApplications != null’ is always true. 127

if (StringUtils.isNotBlank(applications)) {
  final String[] sparkApplications = applications.split(",");
  if (sparkApplications != null && sparkApplications.length > 0) {
    ...
  }
}

Checking the result returned by the split method for null is meaningless, because this method always returns an array and never returns null.

Spoon Project

PVS-Studio warning: V6001 There are identical sub-expressions ‘ !m.getSimpleName().startsWith("set") ‘ to the left and to the right of the ‘ && ‘ operator. 108

if (!m.getSimpleName().startsWith("set") &&
    !m.getSimpleName().startsWith("set")) {
  continue;
}

In this code, there are identical expressions on the left and right of the && operator. In addition to that, all methods in the chain are pure. Most likely, there is a logic error in the code.

PVS-Studio warning: V6007 Expression ‘ idxOfScopeBoundTypeParam >= 0 ‘ is always true. 243

private boolean isSameMethodFormalTypeParameter(....) {
  ....
  int idxOfScopeBoundTypeParam = getIndexOfTypeParam(....);
  if (idxOfScopeBoundTypeParam >= 0) {   // <=
    int idxOfSuperBoundTypeParam = getIndexOfTypeParam(....);
    if (idxOfScopeBoundTypeParam >= 0) {   // <=
      return idxOfScopeBoundTypeParam == idxOfSuperBoundTypeParam;
    }
  }
  ....
}

Here, the author of the code made a typo and wrote idxOfScopeBoundTypeParam instead of idxOfSuperBoundTypeParam.

Spring Security Project

PVS-Studio warning: V6001 There are identical sub-expressions to the left and to the right of the ‘ || ‘ operator. Check lines: 38, 39. 38  

public boolean equals(Object obj) {
  return obj instanceof AnyRequestMatcher ||
         obj instanceof security.web.util.matcher.AnyRequestMatcher;
}

This warning is similar to the previous ones: the name of the same class is written in two different ways.

PVS-Studio warning: V6006 The object was created but it is not being used. The ‘ throw ‘ keyword could be missing. 434

if (!expectedNonceSignature.equals(nonceTokens[1])) {
  new BadCredentialsException(
    DigestAuthenticationFilter.this.messages
      .getMessage("DigestAuthenticationFilter.nonceCompromised",
                  new Object[] { nonceAsPlainText },
                  "Nonce token compromised {0}"));
}

In this code, a developer forgot to add throw before the exception. As a result, the BadCredentialsException object is created but not used, i.e., no exception is thrown.

PVS-Studio warning: V6030 The method located to the right of the ‘ | ‘ operators will be called regardless of the value of the left operand. Perhaps, it is better to use ‘ || ‘. 38

public void setScheme(String scheme) {
  if (!("http".equals(scheme) | "https".equals(scheme))) {
    throw new IllegalArgumentException("...");
  }
  this.scheme = scheme;
}

In this code, the use of the | operator is unnecessary, because the right operand is evaluated even if the left one is already true. Here this has no practical meaning, so | should be replaced with ||.
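The difference between the two operators is easy to observe with a method that counts how often it is evaluated. The class and method names below are made up for this demonstration.

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Counts how many times it is evaluated, then returns the given value.
    static boolean probe(boolean value) {
        calls++;
        return value;
    }

    // Number of operand evaluations for the non-short-circuit `|`.
    static int evaluationsWithBitwiseOr() {
        calls = 0;
        boolean ignored = probe(true) | probe(false);
        return calls;
    }

    // Number of operand evaluations for the short-circuit `||`.
    static int evaluationsWithLogicalOr() {
        calls = 0;
        boolean ignored = probe(true) || probe(false);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(evaluationsWithBitwiseOr());   // 2: both operands are evaluated
        System.out.println(evaluationsWithLogicalOr());   // 1: the right operand is skipped
    }
}
```

When the right operand is cheap and free of side effects, as in setScheme, the result is the same either way; the warning matters most when the right operand is expensive or is only valid if the left one is false.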

IntelliJ IDEA Project

PVS-Studio warning: V6008 Potential null dereference of ‘editor’.

final PsiElement nameSuggestionContext =
  editor == null ? null : file.findElementAt(...);   // <=
final RefactoringSupportProvider supportProvider =
  LanguageRefactoringSupport.INSTANCE.forLanguage(...);
final boolean isInplaceAvailableOnDataContext =
  supportProvider != null &&
  editor.getSettings().isVariableInplaceRenameEnabled() &&   // <=

In this code, the analyzer has detected that a dereference of the null pointer editor may occur. It is necessary to add an additional check.

PVS-Studio warning: V6007 Expression is always false.

public boolean contains(@NotNull VirtualFile file) {
  ....
  return false & !myProjectFileIndex.isUnderSourceRootOfType(....);
}

It is difficult for me to say what the author had in mind, but this looks very suspicious. Even if there is no error here, this code should be rewritten so that it does not confuse the analyzer and other programmers.

PVS-Studio warning: V6007 Expression ‘ result[0]‘ is always false.

final boolean[] result = new boolean[] {false};   // <=
Runnable command = () -> {
  PsiDirectory target;
  if (targetDirectory instanceof PsiDirectory) {
    target = (PsiDirectory)targetDirectory;
  } else {
    target = WriteAction.compute(() ->
      ((MoveDestination)targetDirectory).getTargetDirectory(
        defaultTargetDirectory));
  }
  try {
    Collection<PsiFile> files =
      doCopyClasses(classes, map, copyClassName, target, project);
    if (files != null) {
      if (openInEditor) {
        for (PsiFile file : files) {
          CopyHandler.updateSelectionInActiveProjectView(
            file, project, selectInActivePanel);
        }
        EditorHelper.openFilesInEditor(
          files.toArray(PsiFile.EMPTY_ARRAY));
      }
    }
  }
  catch (IncorrectOperationException ex) {
    Messages.showMessageDialog(project, ex.getMessage(),
      RefactoringBundle.message("error.title"),
      Messages.getErrorIcon());
  }
};
CommandProcessor processor = CommandProcessor.getInstance();
processor.executeCommand(project, command, commandName, null);

if (result[0]) {   // <=
  ToolWindowManager.getInstance(project).invokeLater(() ->
    ToolWindowManager.getInstance(project)
      .activateEditorComponent());
}

Here, I suspect that someone forgot to change the value in result. Because of this, the analyzer reports that the check if (result[0]) is meaningless.


Java development is very versatile. It includes desktop, Android, web, and much more, so we have plenty of room for activities. First and foremost, of course, we will develop the areas that are in the highest demand.

Here are our plans for the near future:

  • The inference of annotations from bytecode;
  • Integration into Ant projects (does anybody still use it in 2018?);
  • Plugin for Eclipse (currently in the development process);
  • More diagnostics and annotations;
  • Improvement of data flow.


The Latest in GitHub, GitLab, and Git

This is your one-stop shop for GitHub and GitLab news: read developer opinions of the recent GitHub acquisition by Microsoft, new releases from GitLab (hint: a web IDE), and some tips and tricks in Git for good measure. 

The GitHub/Microsoft News

  1. Microsoft and GitHub: A Great Step Forward for DevOps, by Sacha Labourey. Microsoft recently announced their acquisition of GitHub, to mixed reactions. Let’s talk about what this really means for developers and companies.

  2. Making GitHub Easier to Use, by Tom Smith. See what makes successful as it enables project management for developers using GitHub.

  3. DevOps and Version Control: Why Microsoft Had to Get GitHub, by Yariv Tabac. It’s no secret that large corporations make frequent use of open-source software. Learn why OSS is necessary for DevOps and version control.

  4. Where Developers Stand on Microsoft Acquiring GitHub, by Alex McPeak. Microsoft has bought GitHub — check out this take on the acquisition from the developers at SmartBear.

GitLab Updates

  1. Meet the GitLab Web IDE, by Dimitrie Hoekstra. Learn about GitLab’s newly announced Web IDE, its current capabilities, and integration of more advanced features it’ll be rolling out in the future.

  2. GitLab: We’re Moving From Azure to Google Cloud Platform, by Andrew Newdigate. GitLab has decided to move from Azure to Google Cloud Platform to improve performance and reliability. Read on for the details of the migration.

Git Tips and Tutorials

  1. How (and Why!) to Keep Your Git Commit History Clean, by Kushal Pandya. Learn why commit messages are so important for organizing your Git repo and methods for keeping your logs in order.

  2. Quick Tip: Grep in Git, by Robert Maclean. Learn about two features in Git to simplify searching: git-grep and git-log grep.

  3. Git on the Go With These Mobile Apps for Git (and GitHub), by Jordi Cabot. On the go a lot but still need to keep an eye on your git repositories? Check out these apps for your smartphone.

You can get in on this action by contributing your own knowledge to DZone! Check out our new Bounty Board, where you can claim writing prompts to win prizes! 

Dive Deeper Into DevOps

  1. DZone’s Guide to DevOps: Culture and Process: a free ebook download.

  2. Introduction to DevSecOps: Download this free Refcard to discover an approach to IT security based on the principles of DevOps. 

Who’s Hiring?

Here you can find a few opportunities from our Jobs community. See if any match your skills and apply online today!

Senior DevOps Engineer
Clear Capital
Location: Roseville, CA, USA
Experience: AWS Foundational Services, configuration management tools, Continuous Integration/Continuous Delivery knowledge, Linux systems scripting, high-level programming language such as Python or Ruby.

Open Technology Architect
Location: Dallas, TX, USA
Experience: An experienced, “hands-on” software architect that has been in the industry or technology consulting for at least 6 years, with a passion for open-source technologies and the art of software engineering.


What’s New In Eclipse JNoSQL 0.0.6

Eclipse JNoSQL is a framework that makes integration between Java and NoSQL databases easier. Its API is standard enough to expose common features through common calls, yet extensible enough to allow the use of features particular to a specific vendor. Version 0.0.6 brings its hottest feature: queries as text. This post covers the release.

The hottest feature of this version is the query by String. It creates a text query that uses the standard API. Briefly, it converts the text query into a method call on an entity manager.


So, beyond the Graph API, which already uses Gremlin as its query language, the API for each NoSQL database type now has its own text query.

DocumentTemplate documentTemplate = ...;
ColumnTemplate columnTemplate = ...;
KeyValueTemplate keyValueTemplate = ...;
GraphTemplate graphTemplate = ...;
List<Movie> movies = documentTemplate.query("select * from Movie where year > 2012");
List<Person> people = columnTemplate.query("select * from Person where id = 12");
Optional<God> god = keyValueTemplate.query("get \"Diana\"");
List<City> cities = graphTemplate.query("g.V().hasLabel('City')");

To run a query dynamically, use the prepare method, which returns a PreparedStatement. In key-value, document, and column queries, mark a parameter by putting “@” in front of its name.

PreparedStatement preparedStatement = documentTemplate.prepare("select * from Person where name = @name");
preparedStatement.bind("name", "Ada");
List<Person> adas = preparedStatement.getResultList();
// for graph queries, keep using Gremlin
PreparedStatement prepare = graphTemplate.prepare("g.V().hasLabel(param)");
prepare.bind("param", "Person");
List<Person> people = prepare.getResultList();
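To make the “@” convention concrete, here is a minimal, self-contained sketch of named-parameter binding. The class NamedParameterSketch and its render method are hypothetical names invented for this illustration; this is not how Eclipse JNoSQL implements PreparedStatement. A real implementation would hand the bound values to the database driver instead of splicing them into the query text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this is NOT JNoSQL's PreparedStatement.
final class NamedParameterSketch {

    // Matches "@" followed by an identifier, e.g. "@name".
    private static final Pattern PARAM = Pattern.compile("@([A-Za-z_][A-Za-z0-9_]*)");

    private final String query;
    private final Map<String, Object> values = new LinkedHashMap<>();

    NamedParameterSketch(String query) {
        this.query = query;
    }

    // Records a value for a named parameter and returns this for chaining.
    NamedParameterSketch bind(String name, Object value) {
        values.put(name, value);
        return this;
    }

    // Replaces every "@name" with its bound value; fails if a value is missing.
    String render() {
        Matcher matcher = PARAM.matcher(query);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!values.containsKey(name)) {
                throw new IllegalStateException("No value bound for @" + name);
            }
            String literal = toLiteral(values.get(name));
            matcher.appendReplacement(out, Matcher.quoteReplacement(literal));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Strings are double-quoted with embedded quotes escaped; other values use toString().
    private static String toLiteral(Object value) {
        if (value instanceof String) {
            return "\"" + ((String) value).replace("\"", "\\\"") + "\"";
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        String rendered = new NamedParameterSketch("select * from Person where name = @name")
                .bind("name", "Ada")
                .render();
        System.out.println(rendered); // select * from Person where name = "Ada"
    }
}
```

The sketch treats every “@identifier” as a parameter, including one that appears inside a quoted string, so it is only meant to show the binding idea.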


The Repository interface contains the common methods shared among the NoSQL implementations, so a developer does not need to write them. It also supports query methods, which build a query from the method name. This version brings two new annotations: @Query, which defines the statement, and @Param, which sets the values in the query.

interface PersonRepository extends Repository<Person, Long> {

    @Query("select * from Person")
    Optional<Person> findByQuery();

    @Query("select * from Person where id = @id")
    Optional<Person> findByQuery(@Param("id") String id);
}
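For contrast with @Query, the idea of building a query from a method name can be sketched as follows. This is a simplified illustration under stated assumptions: the class MethodNameQuerySketch and its toQuery method are hypothetical, only the findBy prefix and the And keyword are handled, and it is not the parser that Eclipse JNoSQL uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not the method-name parser used by Eclipse JNoSQL.
final class MethodNameQuerySketch {

    private static final String PREFIX = "findBy";

    // Turns e.g. "findByNameAndAge" into a text query with "@" parameters.
    static String toQuery(String methodName, String entity) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Expected findBy<Property>: " + methodName);
        }
        List<String> conditions = new ArrayList<>();
        // Split on "And" only when an upper-case letter follows,
        // so a property such as "Android" is not cut in two.
        for (String property : methodName.substring(PREFIX.length()).split("And(?=[A-Z])")) {
            if (property.isEmpty()) {
                throw new IllegalArgumentException("Empty property in: " + methodName);
            }
            // "Name" becomes the field "name" and the parameter "@name".
            String field = Character.toLowerCase(property.charAt(0)) + property.substring(1);
            conditions.add(field + " = @" + field);
        }
        return "select * from " + entity + " where " + String.join(" and ", conditions);
    }

    public static void main(String[] args) {
        // prints: select * from Person where name = @name and age = @age
        System.out.println(toQuery("findByNameAndAge", "Person"));
    }
}
```

With @Query, by comparison, the statement is written out explicitly, which helps when the query does not map neatly onto a method name.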

Remember that the CDI qualifier defines which database type implements the repository, and the query runs against that type: e.g., Gremlin for Graph, the JNoSQL key-value query for key-value, and so on.

@Inject
@Database(value = DatabaseType.COLUMN)
private PersonRepository repository;


This article covered how a Java developer can integrate Java and NoSQL smoothly with Eclipse JNoSQL. In this version, query as text was the most anticipated feature. It offers a single way to query the database, without forgetting the particular query language that each NoSQL provider has. It is important to point out that the focus of the framework is to make integration easier; knowledge of the NoSQL database is still required. As next steps, support for more databases is expected, along with the possibility of bringing the API into the Jakarta EE process, although that has to wait for the EE4P specification process.

Click here to learn more about this version.
