CloudBees teamed up with IDG to conduct a DevOps measurement survey to investigate how well organizations think they are doing with their DevOps journey. We spoke to 100 respondents consisting of IT executives, IT operations, engineering and shared services who are all involved with DevOps adoption in their organizations. These organizations represent a cross-section of small, medium and large businesses. The majority of our survey takers work in the technology, manufacturing, and financial services vertical industries throughout North America.
Our initial findings include:
In helping enterprises transform their engineering organizations and apply DevOps practices on AWS, we are often introduced to all types of legacy systems and processes.
Let’s imagine an enterprise that might have hundreds of brownfield applications and services and they want to move some of these applications to AWS. Let’s also assume they don’t see AWS as another data center with APIs but more as a platform on which to transform their applications and services. That said, they have an existing group of engineers who are excited but not yet ready for the shift of moving all of their applications to leverage a Serverless paradigm — at least for application development. They want to ensure there are tight controls and auditing on all changes to the software systems including infrastructure, application code, configuration, and data.
Efficiency is one of the fundamental pillars upon which any successful business is built. We’re all familiar with the old adage that “Time is money,” but many companies don’t appreciate just how deep this simple observation runs.
Among businesses that do prioritize the efficiency of their operations, there is an ongoing need to devise better workflows and to identify where other inefficiencies lie.
Mainframe-powered firms are all at unique points in their mainframe DevOps journeys. We know this through the DevOps Transition Workshops we conduct with customers; the DevOps data we collect from customers leveraging Compuware zAdviser; and the insight we glean from industry thought leaders, analysts and partners.
Whether that means your journey hasn’t begun or you’re using CI/CD pipelines, we want to help you learn and continue to leverage the power of mainframe DevOps. We’ve made it easy to discover where you are and where you’re going on your mainframe DevOps journey, with five stages that offer supporting resources to help you move forward.
To learn more about what works and what doesn’t in large-scale DevOps and agile deployments, we need data. The problem is, that data is notoriously difficult to get ahold of because much of it lies hidden across numerous private repositories.
Efforts such as The State of DevOps reports have helped us gain some understanding by using survey data to answer questions about practices such as the frequency of deployments in a team or organization. However, survey data has its limitations, as Nicole Forsgren and I described in "DevOps Metrics," which we wrote to clarify the trade-offs of system and survey data collection [1]. Today, our understanding of DevOps practices is based largely on this survey data and on anecdotal evidence. Is there a way to expand our view of DevOps to include studies of system data of DevOps at work?
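As a sketch of what studying system data might look like, the snippet below computes deployment frequency — one of the metrics the State of DevOps reports ask about in surveys — directly from deployment event timestamps. The data source is an assumption for illustration (e.g., timestamps mined from a CI server's logs), not a prescribed method:

```python
from collections import Counter
from datetime import datetime

def deployments_per_week(timestamps):
    """Group deployment events by ISO (year, week) and count them."""
    return dict(Counter(ts.isocalendar()[:2] for ts in timestamps))

# Hypothetical deployment events pulled from a CI system's history.
events = [datetime(2018, 10, 1), datetime(2018, 10, 2), datetime(2018, 10, 9)]
weekly = deployments_per_week(events)  # {(2018, 40): 2, (2018, 41): 1}
```

Aggregating real event streams like this, rather than asking people to estimate, is exactly the kind of system-data study the paragraph above calls for.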
To understand the current and future state of DevOps, we spoke to 40 IT executives from 37 organizations. We asked them, "What’s the future of DevOps from your perspective, where do the greatest opportunities lie?" Here’s what they said:
Here’s who shared their insights with us:
To understand the current and future state of DevOps, we spoke to 40 IT executives from 37 organizations. We asked them, "Do you have any concerns regarding the current state of DevOps?" Here’s what they said:
Here’s who shared their insights with us:
To understand the current and future state of DevOps, we spoke to 40 IT executives from 37 organizations. We asked them, "What are the most common DevOps fails?" Here’s what they said was typically missing:
To understand the current and future state of DevOps, we spoke to 40 IT executives from 37 organizations. We asked them, "How has DevOps changed since you began using the methodology?" Here’s what they said:
Here’s who shared their insights with us:
It’s already over again — the annual get-together of the brightest DevOps minds (well, the brightest who could make it to Vegas). And this time, I want to make sure that what happens in Vegas does not stay in Vegas by sharing my highlights with all of you. It was a great event, with a slightly higher focus on operations than last time.
The four trends that I picked up on:
To understand the current and future state of DevOps, we spoke to 40 IT executives from 37 organizations. We asked them, "What do you consider to be the most important elements of a successful DevOps implementation?" Here’s what they said:
Here’s who shared their insights with us:
In his opening remarks to the Las Vegas iteration of DevOps Enterprise Summit 2018, Gene Kim set the tone for Day One by providing an alternative definition of DevOps via Jonathan Smart (formerly of Barclays): “Better Value Faster, Safer, Happier.” At the London show in June, Jonathan expanded on that by extolling the pressing need to go beyond Agile and DevOps to think more end-to-end about how we build products.
Four months later, amid the glowing neon lights of Sin City, that key message has evolved and intensified. Traditional organizations are moving away from a project approach towards a product-centric model, and focusing on the flow of work and business value across their value streams. That sentiment was echoed across the show floor and the stage as the DevOps and wider IT community considered how to take DevOps and enterprise software delivery to the next level. How do they transform their business in the Age of Digital Disruption? How does IT deliver more value to the business?
In the enterprise, application delivery has always been challenging. Enabling teams across all lines of business to converge for production is no mean feat – particularly when those teams are so disparate they might as well be separate companies. But what is the unified goal that binds these teams together? Increasingly, it appears to be the directive to "go faster" and ultimately to deliver software more quickly.
A proven methodology for achieving greater speed at scale is DevOps. If you can make Dev and Ops work better together while breaking delivery into smaller chunks, then you’re good to go. An integral part of the road to improvement is technology-driven; organizations need to adopt agile technologies in order to increase operational efficiency. However, strong communication and collaboration are also key necessities to make the concept a reality.
Early DevOps practitioners have shown DevOps to be more than just a cultural aspect or a set of tools – they have confirmed it to be a crucial success factor and a competency well worth developing in today’s environment of rapid evolution, technological advancement, and high customer and employee expectations. Demand for DevOps in organizations is high and urgent, but it is not something that can simply be bolted onto the average team. When that happens, existing organizational undercurrents will weaken the effectiveness of the program. Rather, the development, operations, and overarching management processes must be redesigned from scratch. DevOps can be profoundly disruptive to a business, and it has a strong, enduring impact on organizational success. After all, IT is the core of almost any business, and the effectiveness and agility gained there will have a notable impact on the readiness and coordination of the organization as a whole.
The term DevOps has entered into our general language and has gathered much attention and focus these days.
DevOps, as we all know, is the word on the lips of every software enthusiast around the world. It has come a long way and is taking center stage in the world of software development.
DevOps is a cultural methodology in which professionals are groomed to adapt to a new and emergent environment in software-powered organizations. Teams are trained to follow a specific set of rules, steps, and tools to achieve success in DevOps. When it comes to DevOps adoption, having the right attitude, employing the proper tools, and learning the stages of the DevOps cycle are imperative.
The journey to digitally transforming your company is a long one, with plenty of roadblocks and dead ends along the way. However, there’s a way to speed up the journey — ask the right questions before you set out.
MIT researcher Stephanie Woerner offers advice on the right questions to ask in an interview on the Knowledge@Wharton website, "Six Questions That Can Help Guide Digital Transformation." Woerner and MIT co-researcher Peter Weill wrote the book "What’s Your Digital Business Model? Six Questions to Help You Build the Next-Generation Enterprise," based on several years of study at the MIT Center for Information Systems Research.
IT budgets are getting increasingly difficult to secure. Individual IT teams are trying to grab a piece of the pie. At the same time, however, more and more pressure is on the IT department to improve efficiency and performance without raising costs.
This has led to the development of DevOps, a shift from siloed IT teams to consistent and collaborative teamwork in IT groups. This has brought a change in the way IT teams function and operate, and has supported improvements in cloud hosting. Let’s review some ways in which this transformation can help in meeting enterprise goals and objectives on a constricted budget.
Many businesses are hell-bent on doing some kind of transformation. While DevOps is definitely not as ubiquitous as digital or agile transformation (17% of software developers have fully embraced DevOps in 2018), the hype is real. If you aren’t doing DevOps in your IT organization, you’re not taking advantage of such benefits as continuous software delivery, faster problem resolution, more productive teams, improved communication and collaboration, and so on. At least the talk goes like that.
While there’s no denying that DevOps is good for basically any company, many IT organizations bite off more than they can chew when kicking off their DevOps transformation. Some of them try to do everything at once, pushing their teams into disruption and dissatisfaction. Others simply cherry-pick certain practices that they like and end up with dysfunctional “pockets” of DevOps. Of course, this results in misaligned pipelines, broken processes, and burned-out employees. And there are many other reasons for DevOps failure.
I always loved this quote: "Nothing is more dangerous than using yesterday’s logic for today’s problems," which shows that you just cannot afford to get lazy and do the same thing again and again. This causes larger problems when you scale it up. Gary Hamel summarizes the problem our organizations face as follows: "Right now, your company has 21st-century, internet-enabled business processes, mid-20th-century management processes, all built atop 19th-century management principles."
One of the main reasons for me to write "DevOps for the Modern Enterprise" was to help address this mismatch between the work we need to do — creative, IT-based problem solving — and the mindset many managers still have: that IT should be managed just like manufacturing.
Today’s scaling industries aim for large productivity gains, but they have to deal with a wide variety of automation challenges, which tools such as Ansible help overcome. Let’s start with what Ansible Tower is.
Ansible Tower is Ansible at an enterprise level. It is a web-based solution for managing your organization with a very easy user interface that provides a dashboard with state summaries of all the hosts, allows quick deployments, and monitors all configurations. Tower allows you to share SSH credentials without exposing them, log all jobs, manage inventories graphically, and sync them with a wide variety of cloud providers.
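To make the "quick deployments" point concrete, here is a hedged Python sketch of launching a Tower job template through its v2 REST API. The host, token, and template ID are placeholders, and a production script would add TLS verification choices and error handling:

```python
import json
import urllib.request

TOWER_HOST = "https://tower.example.com"  # placeholder: your Tower/AWX host
API_TOKEN = "REPLACE_ME"                  # placeholder: an OAuth2/API token
TEMPLATE_ID = 42                          # placeholder: a job template ID

def launch_url(host, template_id):
    # Tower/AWX exposes job templates under its v2 REST API; POSTing to
    # the launch endpoint starts a new job from the template.
    return f"{host}/api/v2/job_templates/{template_id}/launch/"

def auth_headers(token):
    return {"Authorization": f"Bearer {token}"}

def launch_job(host=TOWER_HOST, token=API_TOKEN, template_id=TEMPLATE_ID):
    headers = {**auth_headers(token), "Content-Type": "application/json"}
    req = urllib.request.Request(launch_url(host, template_id),
                                 data=b"{}", headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:  # requires a live Tower instance
        return json.load(resp)  # response includes the new job's id and status
```

Because everything in Tower is reachable over this API, the same deployments you trigger from the dashboard can be scripted and wired into a pipeline.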
Every business will be a software business. Software is the backbone of an organization and dictates its ability to innovate, improve and react to changes in the market. That puts a big burden on the company’s IT organization to deliver high-quality software quickly and frequently.
Continuous delivery (CD) performance matters not only for IT organizations but also for a company’s ability to be competitive. Indeed, DevOps Research and Assessment (DORA) released its Accelerate: State of DevOps 2018 report, which details the correlation between software delivery and organizational performance.
Most organizations understand that to increase competitiveness in this rapidly changing world, it is essential to achieve digital transformation. DevOps and cloud computing are oft-touted as crucial ways for companies to achieve their needed transformation. The association between the two can be confusing: cloud computing is about technology — its tools and services — whereas DevOps is about processes and their improvement. The two are not mutually exclusive, and it’s crucial to understand how DevOps and cloud work together to help businesses achieve their transformational goals.
Agility is the core component of this relationship: agile methods set the pace, and DevOps provides the automation behind them. Traditional platforms need weeks or months of planning for the necessary software and hardware; that kind of provisioning cannot be done quickly on demand. Automated provisioning and virtualization in the cloud make those same resources available on demand.
DevOps is a key element of many enterprise IT strategies today as digital transformation drives the need for greater efficiency and higher speed. However, teams responsible for Enterprise Resource Planning (ERP) software like SAP sometimes feel like they’re not surfing the same wave.
Do you think that DevOps isn’t for you? That it’s not relevant to SAP? Perhaps you understand the value and are trying to achieve a culture of DevOps for SAP, but find yourself unsure as to whether you’re succeeding? Maybe you’re even reading this thinking "What is DevOps?" You might want to check out a previous blog I wrote for more on this and about laying the foundation for DevOps success.
DevOps has become more than a trend—it’s a survival imperative for the enterprise. In today’s digital economy, software innovation drives business innovation. The faster developers can deliver on the next wave of software innovation, the faster the business can deliver customer value, bring new revenue streams online, and respond to market events. DevOps practices across the enterprise can deliver business results at the speed and quality customers expect.
Many IT organizations start their DevOps journey implementing automation and tools only to quickly face hurdles when trying to scale DevOps practices across the organization. Their journey starts to take a detour as they struggle with organizational boundaries, unwieldy system-wide processes, and cultural resistance to change. It’s common to blame the people and teams that are not getting on board, but to quote W. Edwards Deming, “People work in the system, management creates the system.”
With over 80,000 employees, 150 million customers a year, and 800 aircraft serving 57 countries, there is no denying the size and reach of Delta Air Lines. And with a 94-year history, you can imagine they have lots of legacy technology.
This creates challenges, of course, but times are changing, and Chris Bolton, an application developer, and Jasmine James, a development tools engineer, teamed up at the 2018 Nexus Users’ Conference to discuss how they rolled out Nexus at Delta.
This is the eleventh in a series of blogs about enterprise DevOps. The series is co-authored by Nigel Willie, DevOps practitioner, and Sacha Labourey, CEO, CloudBees.
By nature, technicians are problem solvers. Many companies assess an individual’s ability to solve problems as a precursor to an employment offer.
This morning on our Continuous Discussions (#c9d9) podcast, we had a great discussion with a panel of DevOps Enterprise Summit Las Vegas 2018 speakers and programming committee members to discuss a major theme top of mind for many at the fifth annual USA conference.
This year, "Next Generation Ops and Infrastructure Practices" was put under a larger magnifying glass. Targeted specifically at Ops leadership, and marking the beginning stages of a greater focus on a multi-year roadmap to properly define the problem space, the session continued the discussion from DOES18 London with some of the top minds in DevOps.
In DevOps, culture is as important as the process and tools. That’s why buy-in, from everyone, including the CEO, is essential for success. Even organizations that clearly recognize the business value of adopting a DevOps approach may face a variety of potential stumbling blocks. One of the most prevalent challenges is inherent in the divergence of what developers and operations professionals prioritize and value most, and the traditional development practices that amplify those differences.
Development is concerned with speed and agility while operations is focused on quality and stability. As a result, there are often organizational barriers between the development team creating new software and the operations team responsible for pushing changes to production. Some of those barriers are by design, even though they often bog down processes and create tension between the teams. Most organizations tend to be risk-averse. They don’t want to risk compromising quality or security for the sake of speed, so legacy software development processes and tools are designed to reduce risk and ensure quality, not for speed and agility.
For a large organization, implementing and successfully adopting continuous delivery (CD) within an app system is a challenging, even daunting, task. The idea of overhauling a preexisting way of working can seem like taking a leap into the dark, which often causes executives to balk at the idea. Yet, the jump is not actually that large. The technical issues are well understood and there are plenty of tools available to get a basic, functional infrastructure ‘working’ with a bit of effort.
However, ‘working’ is not enough for modern enterprises. Large organizations are concerned with higher level capabilities and operational approaches that reflect the value of the actual scale at which they exist. These companies want to maximize and exploit the advantages of their size. Enterprises that cannot take advantage of their scale will struggle as competitors and market disruptors alike seek to alter the status quo.
How does the enterprise attain and deliver on the promise of CD at scale? To answer that question, I’d like to look at an interrelated list of “-tions”:
There is no escaping the fact that large enterprises are concerned with who does what. Old-school hierarchies usually think of that in terms of pushing “authority” to do something “downward.” Indeed, this is the way most people think of the word ‘delegation’. In organizations attempting to adopt enterprise-scale continuous delivery, there is also the challenge of horizontal delegation: from one team, or silo, to another. Continuous delivery only works when long communication, decision-making, or approval-granting chains are minimized or eliminated. This can be a challenge for enterprises that are used to very bureaucratic approaches to IT. Delegating control to those closest to the work is critical if the organization is to achieve agility.
Collaboration is closely related to delegation. It is an important enabler of trust, which enhances the willingness to delegate. It also incorporates key communications channels and behaviors that let people learn, share knowledge, and generally improve the organization’s ability to execute rapidly. Maintaining a rapid execution cadence is a key goal of CD and ultimately a reason enterprises are interested in it as a practice.
Large organizations, given their disparate teams, technologies, and even geographies, must deliberately invest time and effort into fostering these interactions. Merely saying people are ‘empowered’ is not enough; goals and incentives must be aligned within and, crucially, across organizations in the same value streams. Otherwise, team-centric interests can interfere with delivering new functionality.
Automation is the capability that makes enterprise scale continuous delivery possible. It is the backbone of a modern software factory.
Automation serves as a source of great efficiency for repetitive tasks, executing them in a consistent, fast, and reliable fashion. This shifts staff time from repetitive, low-value tasks to creative efforts that bring higher business value, resulting in greater productivity for the business.
Successful automation relies on a clear, cross-functional understanding of the overall process, and its value increases the more broadly it is used. Good collaboration is crucial to achieving the understanding required to establish and extend automation within processes, particularly those that cross team boundaries.
Being a large enterprise enables you to leverage scale and achieve higher levels of efficiency via shared resources and reduced duplication of effort. This can be a major management challenge for enterprises because, as we have discussed, CD pushes a lot of activities down to teams. That creates pressure in the enterprise, as CD practices can lead to teams duplicating facilities that other teams have created.
However, the knee-jerk reaction to deduplicate efforts may actually yield a worse outcome for the enterprise as a whole. There is mounting research that the traditional pursuit of efficiency will generally yield worse business outcomes when it comes to delivering software. That said, too much duplication will also yield negative outcomes; remember that even a high-speed tech titan like Google enforces a single code repository standard, because that is how it manages its intellectual property.
Striking the correct balance requires good automation. By eliminating repetitive tasks, it can empower teams to use a central resource in a self-service manner. Knowing what is critical to maintain centrally, such as intellectual property or compliance, is a key part of intelligent delegation.
The first four “-tions” are all related to each other and there is a delicate balance required to successfully adopt CD. The fifth “-tion,” Instrumentation, sits at the heart of the CD practice and enables all participants in the enterprise to contribute to its success. Instrumentation provides the trust, transparency, and communication required to maintain and improve the CD practice in a sustainable way that benefits a large enterprise.
The 5 “-tions” outlined above serve as a framework that can help focus conversation and effort around enterprise scale CD. It can be a tricky balancing act, but the research published in books such as Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations, by Nicole Forsgren, Jez Humble and Gene Kim, reflects that companies that successfully adopt modern practices such as CD will substantially outperform their rivals.
Enterprises that want the benefits of DevOps practices such as CD must act deliberately. Early successes from individual teams or “grassroots” efforts are good and provide real data that such approaches are feasible. However, without a coherent framework such as these 5 “-tions,” scaling the early successes across the numerous and diverse teams within an enterprise will be impossible.
Hello there, once again, and welcome to another hot shot. My name is Peter Pilgrim: platform engineer, DevOps specialist, and Java Champion.
What is a decent definition of DevOps? Here is a simple explanation: it is a combination of software development skills with those of a system administrator.
If you know how to program, and program very well, then you are similar to me, and your approach to DevOps is closer to that of a tactical, top-flight software engineer.
You most likely have been programming application software almost all of your life. You can write decent-quality, productive programs, impressive ones to boot. You possess creative skills, and you already know plenty of programming languages from experience. You might be proficient in high-level programming on the JVM, such as the mother of them all, Java. You might already have an alternative language under your belt, such as Groovy, Scala, or even Clojure. If it is not the JVM, then you have experience of native programming in C/C++ from back in the day. In modern terms, you might have developed in Go or Rust. If you have no JVM experience, then you might have come to this from the Microsoft community with C# and .NET, and if you are truly hipster, you might have Ruby experience.
(These notes are a little bit more off the cuff than usual — my apologies)
System administration is about ensuring that all related computer systems and services keep working and remain available for use by the business (a system can be a website, a persistent store, LDAP, or a suite of core applications).
A system administrator, or sysadmin, is a person responsible for maintaining and operating a computer system or network for a business. The role of a system administrator can vary a lot from one organization to another. For example, in one organization, such as a forward-looking financial institution, you might be looking at AWS and PaaS solutions. In another, such as a utility energy business, you may only have to deal with a traditional RDBMS and Java application servers.
In the old school of thinking, classic system administrators are usually charged with installing, supporting, and maintaining servers or other computer systems, and with planning for and responding to service outages and other problems. In the new school of thinking, especially with cloud computing, sysadmins are heavily involved in automation, monitoring, and, eventually, site reliability.
System administrators share programming duties with their classic software engineering colleagues. Practically every system administrator knows how to program, and knows how to test code (whether test-driven or not). Other duties may include scripting or light programming, project management for systems-related projects, supervising or training computer operators, and being the equivalent of a handyman for computer problems beyond the knowledge of technical support staff.
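As a trivial illustration of the kind of light scripting meant here, consider a sketch (my own generic example, not from the talk) that flags a filesystem running low on space:

```python
import shutil

def percent_used(total_bytes, used_bytes):
    """How much of a filesystem is already consumed, as a percentage."""
    return 100.0 * used_bytes / total_bytes

def disk_alert(path="/", threshold=90.0):
    """Return True when the filesystem holding `path` is over the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return percent_used(usage.total, usage.used) > threshold
```

A sysadmin would typically hang a script like this off a scheduler or a monitoring agent so the check runs without anyone watching.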
I didn’t have an operational background, but throughout my career I had to dig under the hood of the computer: to install databases, and to install and understand operating systems such as Linux and its various distributions, including SuSE, Red Hat, and Ubuntu. I even stuck the CD-ROM into machines to install Solaris. I learned the first thing about continuous integration by configuring Jenkins, and picked up web server management, application servers, database management, bug trackers, setting up fundamental security for users, and programming with scripting languages.
So modern DevOps encompasses both software engineering and system administration ideas. These are reasons why people and businesses are into DevOps:
Notice that I say very little about productivity in the bullet items above. This is because DevOps, if organized and executed with the correct procedures, attitude, and organizational freedom, ought to provide better visibility, transparency, enhanced feedback, and human teamwork. The road to DevOps can be muddied severely by corporate misunderstanding, organizational culture, and other economic factors. DevOps is easier to “install” in a new young company, division, or department. It is far harder to provide the function in organizations with legacy technology dependencies and a waterfall software development process and methodology.
From the business point of view, they might already understand two concepts: the SLO and the SLI.
SLO – service level objective – this specifies the SMART goals that define business application reliability. These objectives define the minimum requirements that business operations must meet in order to do productive work. It might be building the next fleet of cars in manufacturing, or it could be how many foreign-exchange payments will be processed in the month of May next year. These are achievable goals, not vague ones, so the idea that "I want my website up and running 24/7/365" is simply not good enough for an SLO.
SLI – service level indicator – this specifies the technical and business metrics by which we measure whether we are reaching the SLO. These are definite, clear, concise, periodic, and reliable measurements that demonstrate how the business application is hitting its set goals. This means we have indicators from various analytical tools (prefer dynamic over static) that humans can digest in real time. I am glossing a bit over automatic scaling of cloud runtime instances, but there is no point in having auto-scaling if you cannot understand the measurement data you are reading. If you cannot interpret it, don’t do it.
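To make the SLO/SLI distinction concrete, here is a small illustrative calculation (my own sketch, not from any particular monitoring tool): the error budget an availability objective leaves over a period, and a check of a measured indicator against the objective:

```python
def allowed_downtime_minutes(slo, period_minutes=30 * 24 * 60):
    """Error budget: the downtime an availability SLO permits (default: 30 days)."""
    return (1.0 - slo) * period_minutes

def slo_met(sli_availability, slo):
    """SLI check: did the measured availability reach the objective?"""
    return sli_availability >= slo

# A 99.9% monthly objective leaves roughly 43 minutes of downtime budget;
# "up 24/7/365" leaves a budget of zero, which is why it is not a usable goal.
budget = allowed_downtime_minutes(0.999)
```

The point is exactly the one above: the objective must be an achievable number, and the indicator must be a measurement you can actually compare against it.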
This is, for me, the high-level view of what Developer Operations is. It is about building application build pipelines; providing and securely managing in-house and third-party artifact and assembly dependencies; managing persistence datastores, short-term and long-term (like data lakes); providing functionality to monitor applications and analyse the data going in and out of them, their lifecycle and payloads; and, of course, with the best saved for last, the biggest and most important one, security: protecting users’ data (GDPR) and commercial data.
DevOps and platform engineering are essentially about building an automated pipeline that allows implementations of business logic and ideas to be coded and then deployed to production. Some people will want to throw the C word, “continuous,” into that last sentence; however, I strongly recommend that you crawl before you walk, and walk before you run.
I think that I will stop here. Bye for now.
This article is featured in the new DZone Guide to Automated Testing: Your End-to-end Ecosystem. Get your free copy for more insightful articles, industry statistics, and more!
1. The keys to automated testing are to save time and improve quality while realizing the goals and objectives of the organization. Value drivers are quality and velocity, and they need to be prioritized first. 70% of the time, clients are looking for both and it can take multiple meetings to decide what to prioritize. Automation helps customers with a quality assurance process that runs in minutes and hours versus days and weeks, all while removing errors.
The organization must understand its goals. Planning is important. Many organizations are facing the traditional issue of “we’re too busy to improve, we don’t have time to think about test automation, but we know we are going to fail if we don’t have it in place soon.” They need to understand the end goal of doing this kind of testing.
There are four key factors of success: 1) level of automation (must be 90%+ for CI/CD to succeed); 2) maximize coverage of all flows, all datasets, quality of app and UX; 3) creating an efficient feedback loop with data; and 4) adopting an integrated approach and platform for testing across all devices. Organizations need to be able to do end-to-end DevOps testing throughout the SDLC. Testing needs to shift left at conceptualization.
2. The most significant change to automated testing in the past year has been increased adoption. This is attributed to the maturation of Agile methodologies and adoption of the DevOps culture. Agile-driven teams are no longer separating automation from development. Organizations are realizing that as complexity increases, automation is the only way to keep up with all of the changes. Also, the continued evolution of tools and frameworks has made them easier to work with. Enablement tools integrate automatically, so tests can be built quickly.
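As a trivial illustration of the kind of test that such tooling runs unattended on every commit (a generic sketch, not tied to any vendor's product):

```python
def apply_discount(price, percent):
    """Business rule under test: apply a percentage discount to a price."""
    if not 0 <= percent <= 100:
        raise ValueError("percent out of range")
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    # A CI server executes checks like this on every build, with no human involved.
    assert apply_discount(100.0, 25) == 75.0
    assert apply_discount(19.99, 0) == 19.99

test_apply_discount()
```

Once checks like this are wired into the build, the feedback loop described above shrinks from days of manual verification to seconds per commit.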
3. There are more than a dozen benefits to automated testing being seen in six key industries. Automated testing is invaluable for: 1) saving time by running tests automatically 24/7; 2) reporting with daily insights and accurate feedback; 3) consistency and accuracy; 4) saving money; 5) reducing resources (i.e. manual testers); and 6) achieving full coverage. Manual testing can achieve 6% code coverage, while well-written automated tests can achieve 100% coverage. Automated testing is helping organizations achieve continuous integration (CI)/continuous delivery (CD) and is helping legacy enterprises make the digital transformation with microservices.
The industries most frequently mentioned as benefitting from automated testing are: 1) automotive, especially automated cars; 2) healthcare – pharmaceuticals, medical devices, and patient monitoring; 3) telecommunications; 4) financial services – brokerage and algorithmic trading; 5) e-commerce; and 6) the Federal Government.
4. The ROI of automated testing is three-fold: time saved, fewer defects, and greater customer satisfaction and retention. Organizations can see 20 to 40x cycle time improvements by spreading the work across different machines. Going from weeks to days and days to minutes can yield 20x savings. Conservatively speaking, if you find a defect, it takes 10 times longer to fix in production than earlier in the SDLC. Catching errors earlier and more accurately saves a million dollars per year in developers not having to look for errors. For us, automated testing has had a direct correlation with customer satisfaction. The product is simply running better, and the customers are happier.
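Those cycle-time gains come largely from parallelism. As a rough illustration only (the test suite and timings below are made up, not from the interviews), here is a minimal sketch of sharding a suite across parallel workers:

```python
import concurrent.futures
import time

def run_test(name):
    """Stand-in for a real test; sleeps to simulate work."""
    time.sleep(0.01)
    return (name, "pass")

tests = [f"test_{i}" for i in range(100)]

# Serial baseline: every test runs one after another.
start = time.time()
serial_results = [run_test(t) for t in tests]
serial_time = time.time() - start

# Parallel: spread the same suite across worker threads
# (in a real pipeline these would be separate machines or agents).
start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    parallel_results = list(pool.map(run_test, tests))
parallel_time = time.time() - start

print(f"serial: {serial_time:.2f}s, parallel: {parallel_time:.2f}s")
```

The same sharding idea scales from threads on one machine to fleets of build agents, which is where multi-x cycle-time improvements come from.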
5. The most common issue affecting the implementation of automated testing is a corporate culture accustomed to manual testing. Where manual processes have been used, people need to be retrained and turned into programmers — but management doesn't want to ruffle feathers. Companies need to shift left, and developers need to learn to write tests. Old-style testers are not adapting to, let alone embracing, automated testing and AI. We need one comprehensive, automated, visible delivery process to share the feedback from the different tools.
6. The future of automated testing is the use of AI/ML to improve the process on several fronts. Testing will be designed for AI/ML to build predictable models and patterns. Testing will be as natural as writing code, and much of it will be done by machines. AI/ML will be part of the solution as teams generate more data, which will make the models smarter. With AI/ML, testing will be faster and more thorough, and will result in self-healing tests and self-remediating code. You will be able to use AI/ML to run every test imaginable in the least amount of time to ensure your code is always vulnerability- and defect-free.
ML will also improve automated security testing, as security teams are able to leverage historical vulnerability data to train ML models to automate the vulnerability verification process, thereby providing developers accurate vulnerability data in near real-time.
7. The three primary skills a developer will need to ensure their code and applications perform well with automated testing are: 1) test scripting skills; 2) understanding the use case of the application; and 3) the ability to shift left, beginning testing earlier in the SDLC.
Hone test scripting skills. Write small, simple tests. Recognize the different types of tests you will need to run at different points in the SDLC. Have the ability to write your own unit and regression tests. Know how automated tests are going to be written. Learn to write a model type of code from mentors and previous products.
Think about the use cases first and what the purpose of the code is. Understand users, domains, who they are, and what problem they’re trying to address. Have an overarching view of the application, what it’s doing, and how that impacts APIs and services.
Get rid of simple, recurring problems by architecting the system to be tested from the beginning. Leverage the testing methodology from the planning phase, and build richer code earlier in the SDLC.
Here’s who we interviewed:
Tom Joyce, CEO, Pensa
The following is an excerpt from a presentation by Chris Hill, Software Development Manager at Jaguar Land Rover, titled “Context Switches in Software Engineering.”
You can watch the video of the presentation, which was originally delivered at the 2017 DevOps Enterprise Summit in San Francisco.
I currently head up systems engineering for the current generation of infotainment. If you were to sit in one of our Jaguar Land Rover vehicles made in the last two years, you would use my current product.
Today, I'm going to talk about context switches in software engineering, which is probably one of my favorite topics. Specifically, I want to share:
I like to visualize the software development life cycle in terms of a factory.
You have your inputs.
You have your planning, where your project managers would typically schedule and do your prioritization.
You’ve got your backlog where your change requests and your features come in.
Then you have flow control: whether or not to release WIP (work in progress) into your factory.
You are probably very familiar with these stations: requirements, design, develop, validate, deploy, and ops. Envision all of the software activities that are done there, all of the specialties.
Like many software startups, the company was just starting its software product. I happened to be right out of college, and I was the only software personnel they hired. Unfortunately, that meant I was in every part of the software development lifecycle. One benefit of operating this way is that I could operate with full autonomy.
That meant I was the only one who could interact with the entire factory, and one thing I didn’t have was the ability to plan on what I worked on.
I may come in at the beginning of the day and think that today was going to be an ops day. I may take an hour for customer needs and wants, and do requirements authoring. Then I may remember that I'm about 75% of the way done with a bug fix, realize that that's a higher priority, and switch to that. I may have some code that wasn't published from last week that I know is ready for release (this was back in the days of manual deployment), and so I actually need to do some deploy work. If I'm really unlucky, since I'm terrible at design, I'll get asked to do some HMI design work, maybe within the next hour.
Unfortunately, every day was a different station for me, and I was the bottleneck at every point of the factory.
Fast forward to JLR, infotainment specifically.
We've got a lot more people, and these people could just contribute at their own station, could be proficient enough to review other people's work, or could potentially be proficient enough to move to another station. Typically, more people will allow you to scale.
Imagine we're all CPUs. If we're working on a current set of instructions, and another, higher-priority set of instructions comes into the queue, we need to save the current state of our instructions, address the higher-priority set, finish it, take it off the stack, then resume the state of the lower-priority set and finish executing that.
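The CPU analogy can be made concrete with a small sketch. The task names and step counts here are invented for illustration; the point is the save-preempt-resume pattern:

```python
class Task:
    def __init__(self, name, steps):
        self.name = name
        self.steps = steps      # remaining units of work
        self.done_steps = 0

    def run_one_step(self):
        self.steps -= 1
        self.done_steps += 1

def execute(task, interrupt_after=None, interrupt_task=None, log=None):
    """Run `task`; optionally preempt it partway with `interrupt_task`."""
    stack = []
    log = log if log is not None else []
    while task.steps > 0:
        task.run_one_step()
        log.append(task.name)
        if task.done_steps == interrupt_after and interrupt_task:
            stack.append(task)                 # save state of the current task
            execute(interrupt_task, log=log)   # service the higher priority
            task = stack.pop()                 # resume the lower priority
    return log

low = Task("bugfix", steps=4)
high = Task("outage", steps=2)
trace = execute(low, interrupt_after=2, interrupt_task=high)
print(trace)  # bugfix runs 2 steps, outage preempts, bugfix resumes
```

For a CPU, the saved state is a few registers; for a human, as the talk goes on to argue, the "state" includes project, toolset, and environment, which is why some switches cost far more than others.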
Humans do the same thing.
If I’m sitting at the development station or I’m working against the development station, and I’ve been asked to work on task number two even though I’m in the middle of task number one, if it’s the same project I’m going to consider it a lower penalty.
If you look on the bright side, I’ve got a “Barometer of Penalty.”
The next stage up in penalties is if I ask you to work on a different project.
It happens to be the same station, the same type of work, but it's a different project. Now, I need to save the state of my previous task and previous project and ramp up on the next one to finish it, if I'm told that it's a higher priority and that I need to be working on it right away. That's a little bit higher on the penalty scale.
The next one is design, or a different station. If I ask you to do a different station or a completely different work type but I keep you on the same project — I’m hoping you’ll be a little more familiar because you know the project, but this is a completely different type of work so it’s a higher penalty on my barometer.
The last one is if you switch everything: your station (which is your task type), project, toolset, maybe the computer that you're using, maybe an operating system that you're familiar with; you could even be asked to go to a separate building. There are many other variables.
If you’re completely changing your environment and your variables, this is very taxing on the human mind. I’m sure we’ve all dealt with this at one point in time or another, but in terms of a CPU, they just have to worry about addition. In terms of a human, you have to worry about all these other variables. It’s almost like asking a CPU to cook me breakfast tomorrow. It has no idea how to do something like that, but at the same time it’s higher priority and I need you to address it right away.
Should we eliminate all higher penalty context switches?
The answer is, it depends.
We found that if you can sustain the capacity and remain efficient across the different specialties, then you can avoid these higher-penalty context switches.
My favorite question: should we train cross-functional teams or train cross-functional people?
The difference between those is somebody who could work and be proficient at multiple stations, or a team that is capable of sharing empathy, that each one of them can be specialized at their own station.
Which one is more of a worthwhile investment?
Are some stations and roles easier to switch to than others? Do some roles understand other roles better than others?
Here in infotainment, these are the specialties or the roles that contribute in our software factory.
You’ll probably recognize the majority of these because they match typically in the industry.
These are the value contribution areas within our factory.
First up, Flow control station: Idea to WIP.
I went around and I asked my coworkers and I asked other people in the software industry the question that’s defining those arrows. I call those arrows empathy and proficiency arrows.
Out of all the product owners that you know, on average could they step in and be proficient at being a customer for that product?
Out of all of the project managers you know, on average could they step in and be proficient at being a customer?
Now I know that’s a complicated question. However, the green arrow symbolizes a yes, the red arrow symbolizes a no. We found that the relationship, in this case, is a highly empathetic relationship towards the customer.
These are the primary actors that exist specifically within this flow control station, we’re trying to determine whether or not WIP should be released into our factory.
I'm not saying these specialties can't do each other's jobs, I'm just saying on average these are the results. Typically, what happens in this particular station are what I call micro-consulting engagements, where we're actually interrupting all of those other specialties to determine whether or not we should release WIP. All of those interruptions are context switches of their own as well.
What’s interesting is if I’m sitting at my desk and I’m absolutely crushing out code, and I’ve got my headphones on, completely in the zone, and somebody walks over to my desk and interrupts me, they’re automatically assuming that what they have to talk about is of higher priority than what I’m currently working on.
I don’t think that’s fair. In fact, I think they should rank whatever they’re going to talk about.
In the CPU’s case, all they have to worry about is addition and this queue line. I kid you not, I’ve actually had a queue line at my desk full of people who were going to interrupt me.
Typically that prioritization is something of the equivalent; had the CEO been in the queue line, I would imagine I'd treat it like a CPU treats a real-time thread: you can come to the front, what do you have to say?
The next station is the requirements station.
The same relationship exists. One interesting thing I’ve found is that customers on average aren’t good at writing requirements, specifications or even exactly what they want.
They’re very good at talking about needs, talking about wants, talking about vision, but typically when it comes to handing over to a developer, most of the time I’ve found it’s not enough.
They have the same sort of micro-consulting engagements that the previous station did, again all interruptions, to ensure that the requirements being written are not going to be impeded further on downstream.
The next design station is “How should we build this?”
This is an interesting one.
This is design, and design to me can be in two different categories. Design is super overloaded. It could be architecture, it could be HMI design.
And what you see here is a lot of red arrows. Basically, I asked my coworkers again: out of all the architects you know, on average would they be proficient at being an HMI designer? The answer was no. The reverse relationship exists, as well as the same thing exists within the customer.
What this actually can show is there are some automatic friction points that exist between these specialties. This could also show you that we could spend some time to make them a little bit frictionless, or maybe we could spend the time developing a relationship that doesn’t have to do with the product or the project, but with the people in general.
Typically there are validation engagements that happen, which are also interruptions. For example, one of the UX designers has a trail-off based on how much effort they plan to put into a product. When they're finished with their wireframes, or with the majority of iterations, they reserve their remaining capacity for these interruptions. They're adding it into their workflow, which I thought was pretty smart.
The same consulting engagements exist further on downstream.
In the develop station, if we ask ourselves the same question: out of all the developers you know, on average could they fulfill a QA engineer role and be proficient at it?
A lot of people in these specialties don't necessarily want to specialize in one of the other areas; however, they could. We get double greens here across all three of these.
This is one thing that contributes to the value of DevOps — all three of these specialties understand each other’s risk, understand where each other’s coming from, understand what they could do to help the other person complete their task.
Validation engagements exist. We’ve migrated from design or theoretical, and now we’re at implementation. Most of these engagements are “Hey, I went to build the thing you told me to build or the thing that you wrote out, and it’s not going to work for me. It’s definitely not working out.”
All of our build machine infrastructure runs in Docker containers, and it's all defined within Packer.
Each one of our developers contributing to a product, if they have some library or component they need the build machine to have, can go right to a Packer recipe, create a new set of infrastructure on a branch completely in isolation, and point their product to build with that new infrastructure, all without bothering or disrupting anyone else in the entire workflow.
Here, ops has enabled self-service for the developer to work completely on their own and test whatever they need to. "I've got this new version of Python I need to put on the build machines." "Okay, there's the Packer repository, go ahead and do it." We have CI on that Packer repository. We get an automatic deployment to a Docker registry. That Docker registry is pointed to by the product.
Another way we exploited the double green arrow is automated test feedback with CI/CD pipelines. We can put test cases into an automated pipeline so that developers can get the results back quicker.
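Getting results back quicker is partly an ordering problem. One common trick (sketched here with invented test names and timing data, not JLR's actual pipeline) is to run previously failing and fast tests first, so a developer sees the likeliest failures sooner:

```python
# Hypothetical per-test history a pipeline might keep.
history = {
    "test_login":    {"avg_seconds": 2.0,  "failed_recently": True},
    "test_checkout": {"avg_seconds": 30.0, "failed_recently": False},
    "test_search":   {"avg_seconds": 5.0,  "failed_recently": False},
    "test_profile":  {"avg_seconds": 1.0,  "failed_recently": True},
}

def feedback_order(history):
    """Order tests: recently failing first, then fastest first."""
    return sorted(
        history,
        key=lambda t: (not history[t]["failed_recently"],
                       history[t]["avg_seconds"]),
    )

print(feedback_order(history))
# ['test_profile', 'test_login', 'test_search', 'test_checkout']
```

With this ordering, a broken build usually fails in the first seconds of the run instead of after the slowest test, which shortens the feedback loop without adding hardware.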
Validation and deploy stations are the same type of relationship. However, your primary actor is typically the QA engineer. There are validation engagements that also exist when you’re in the QA phase.
Sometimes the validation engagement could be, “Should we ship this or not?” or “Should we disable this maybe in production before we actually let it out?” One thing that’s unique about developing for an embedded device is we can actually put it into a production representative vehicle without officially saying that we’ve released things. It’s very difficult for us to compare to the web world because in the web world we can release everything out to millions of customers at scale very quickly. For us, we contribute toward an embedded device or an OS that runs on an embedded device, and there’s a point in time at which we bless that for a release.
One way that we exploit this, specifically for the validation and deploy stations, is to virtualize and simulate the production environments so that we don't have to use hardware. One of the challenges with hardware is that it typically doesn't scale, or by the time you've scaled it for what your team demands, it's already outdated.
Here’s the ops station. The only surprise here for me is actually the architect. Most of the average architects we’ve found could be proficient at an ops role. Now, that’s not necessarily whether they want to be, but they could be.
Here are the lessons that we’ve actually learned:
Attend the DevOps Enterprise Summit, Las Vegas: October 22-24, 2018
The following is an excerpt from a presentation by Cornelia Davis, Senior Director for Technology at Pivotal, titled “DevOps: Who Does What.”
You can watch the video of the presentation, which was originally delivered at the 2017 DevOps Enterprise Summit in London.
Throughout the years, I’ve had the great opportunity of working with very, very large enterprises across all verticals. My background is as a technologist, I’m a computer scientist, and initially, I spent a lot of time talking tech at the whiteboard. But then I realized that there was so much more that needed to change, which is why I’m sharing now about the organizational changes that can support our technology needs.
We have different business silos across the organization and different individuals that are coming from those silos. When we have a new idea for a product, we kick off a project and individuals go into the project to do some work.
The first individuals from the first silos come in, and they generate their artifact. Then what do they do? They throw it over the wall to the next step. If you look at this slide below you’ll notice once they’re done they leave the project.
If for some reason we have to go backwards, we have to figure out how to get them back into the project. And so it goes through each silo. We all recognize that this is a slow and challenging process. If it only moved linearly, it might be okay. But we all know that this process goes backwards and forwards, even in circles!
But that’s not even the biggest problem of these things.
The biggest problem is that each one of these organizations is incentivized differently.
My favorite examples are App Dev and QA — so let's look at these.
Application Development is almost always incentivized by ‘Did you release the features that you promised on time, and ideally on budget?’ And, if you released the features on time you get a ‘Way to go, you achieved your goals.’
Then it moves over to QA.
What is QA incentivized on? Well, they’re responsible for quality. So they are generally incentivized by the number of bugs that they have found and fixed.
Now, let’s look at these things in combination. What happens when the application development process starts to fall a little bit behind? Developers start working late into the evenings. They work on weekends, they start working very unsustainable hours, and what happens? Quality suffers, but they hit their features on time.
Well, when they throw that over the wall to QA, what’s gonna happen now?
QA is going to find more bugs. Way to go! So we’ve got locally optimized metrics that do not create a globally optimized solution. That’s a big problem.
Well, the answer is really simple…
What we’re going to do is we’re going to center things around a product and the product team is incentivized to deliver value to a customer, to deliver value to some constituency.
For example if I’m in an e-commerce scenario:
I have a product team that is really about the best experience around showing product images, recommendations, soliciting reviews, or it could be some back office product that is enabling your suppliers. These are all the different product teams.
There’s been a lot of research, and a lot of discussion, and a lot of proof points that product teams are really the way to go.
But what if we don’t have product teams? What if we have different roles within the SDLC, how do you create product teams of these different disciplines to come together into a product?
If you have been living in a cave for the last 10 years, and you don’t know what the sorting hat is, this comes from Harry Potter. When new students come to Hogwarts School of Witchcraft and Wizardry on their first day, each one of them places the hat on their head and they get sorted into one of four houses, and that’s the house that they live in for the next seven years.
So we’re going to take those roles and we’re going to sort them into houses. But the question then is, what are the houses that we’re going to sort into? So let’s take a little bit of a tangential ride over to the side and think about a couple of houses. (I’m going to end up with four houses in the end, but I want to start with two.)
The left part of this slide you’ve all seen for the last several years.
That is where we were maybe 15 years ago. IT was responsible for the entire stack from the hardware all the way up through the application.
Then VMware came along and virtualized infrastructure.
Then a whole host of people made infrastructure as a service available, Amazon Web Services of course being kind of the behemoth of that.
That made it so that we could just get machines, EC2 machines for example, and then we could stand up everything that we needed on those machines. Getting machines was easy.
Then in the last five years or so, we’ve taken that abstraction up another level and we’ve created application platforms where we have individuals who can be building applications, and the only thing that they need to worry about is their application code.
What's important about that application platform is that it generates a new set of abstractions. Those abstractions are at a higher level. They are fundamentally the application, or maybe some services there in support of that application, and it allows us to not do things like implement security by creating firewall rules at machine boundaries, but instead allows us to implement security at the application boundary.
This new abstraction is one of the key things that’s happened in platforms over the last five years. It’s given us something really interesting and really important. It’s allowed us to define two different teams. And it’s defined a contract between those teams that allows these teams to operate autonomously.
When we hear about all of the different goals of an enterprise, they all talk about needing to bring software solutions to market more quickly, and more frequently. So agility, and autonomy, and teams is incredibly important. We’re always looking for those boundaries where we can create more autonomy.
The team that’s going to create the next mobile app or the next web app or even some analytics app for example, can focus on building that application, and they don’t need to worry about even the middleware that sits below it.
They’re responsible for creating the artifact. They’re also responsible for configuring the production environment, deploying to production. They are doing Dev and Ops. It’s not necessarily the same person, but it is the same team. They’re deploying to production, they’re monitoring. When they notice that they need more capacity, they’re scaling so that they can achieve better performance. They deploy new versions when they need to.
It’s entirely up to them.
Then there's the team that's providing the platform, and notice that they're doing exactly the same things.
They are deploying the platform, they’re configuring it, they are monitoring it, they are upgrading it when they need more capacity, or upgrading it to the next version. They’re doing the same things but they have their own products that they’re working on. So the product orientation is really key.
This separation gives us the first two houses that were going to sort into. The APP team, and the platform team.
Now let’s take all of these roles that come from traditional organizations and start sorting them. And, so here’s our two houses, the APP team and the platform team.
We’re going to do this piece by piece and I’ll explain the steps as we go along.
The first ones that we’re going to do is we’re going to start with the purple bubble there. Before I sort them, notice that this Middleware and App Dev team is actually taking care of both the Middleware and the application development.
In retrospect, having worked in this new world for the last five years, I find this kind of counterintuitive because why would somebody who’s creating an application i.e., using the middleware be in the same group as the middleware itself? To a large extent, it’s because in the past middleware required a great deal of expertise. You had to know a lot about the middleware to be able to effectively program against it. That’s something that we’re trying to move away from. We’re having more agile middleware platforms and so on.
Notice what happens here.
We’ve got middleware and we’ve got App Dev, and we break those apart. We put the middleware engineers inside of the platform team. They’re part of that team providing the capabilities that the APP team can then use. Then we take kind of a full stack application development team and put them up in the APP team. We’ve got front end, and we’ve got back end. All of those individuals are there.
That one’s pretty straightforward.
The next one that’s also pretty straightforward is we’re going to pull some of the folks out of the infrastructure team, the folks responsible for building out the servers and the networks.
You might have noticed I put virtualized infrastructure and platform together in one team. Many of our customers actually keep those separate, but in this case it really wasn't important to make that separation. You could separate the platform team into two individual teams as well. The thing that I would caution you on is that you need to make sure you then have a very crisp contract between the platform team and the infrastructure team.
I’ll be honest with you, that’s a little bit harder to find at the moment, so that’s part of the reason I’ve put them together.
Again, server build-out, network build-out, they are part of the platform team providing the view of the infrastructure up to the App team.
The next one that we’ll talk about here is what I like to call the control functions.
There is information security, for example, and change control. Why did I move them at the same time? Change control usually comes out of the infrastructure team, and information security out of the chief security office. I moved them at the same time because they share a common characteristic: they are functions that today can stop a deployment. They are functions that, on every release into production, need to give their blessing.
We've seen that when it comes to the very end and we find problems in information security, or any other type of security, it can actually stop everything. There's a huge ball of things that we need to check off.
These functions here, information security and change control should engage with your teams that are providing the platforms, and the automation around the deployments to ensure that their concerns are satisfied. Their concerns are not wrong. It’s just the way that we’ve been solving them is something that’s in need of transformation.
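One way those control-function concerns can be expressed in the platform's deployment automation, rather than as a manual sign-off, is as an automated policy gate. This is only an illustrative sketch; the release fields and gate rules below are hypothetical, not a specific product's checks:

```python
# Hypothetical release metadata a pipeline might collect automatically.
release = {
    "open_critical_vulns": 0,
    "change_ticket_approved": True,
    "security_scan_passed": True,
}

# Each gate mirrors a control-function concern
# (information security, change control).
GATES = [
    ("no critical vulnerabilities", lambda r: r["open_critical_vulns"] == 0),
    ("change ticket approved",      lambda r: r["change_ticket_approved"]),
    ("security scan passed",        lambda r: r["security_scan_passed"]),
]

def evaluate_gates(release):
    """Return (ok, failures); the deployment proceeds only if ok is True."""
    failures = [name for name, check in GATES if not check(release)]
    return (not failures, failures)

ok, failures = evaluate_gates(release)
print("deploy" if ok else f"blocked: {failures}")  # prints "deploy"
```

The design point is that the same concerns that used to stop a release at the very end are evaluated automatically on every deployment, so a blessing is no longer a human bottleneck.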
In Part 2, we’ll talk about Ops.
DevOps is quickly evolving as a foundational capability for achieving business and IT agility. While most companies are able to adopt DevOps automation (a CI/CD pipeline) at the team or system level, they are not able to scale it at an enterprise level.
Just as the Agile practice of Scrum needs to be scaled with enterprise frameworks like SAFe, Scrum of Scrums, LeSS, DAD, etc., enterprise DevOps (E-DevOps) also needs to be built on certain key framework principles.
Below are some of the key framework principles that one should consider for successful DevOps at the enterprise level:
An organization should define a standard continuous delivery (SCD) procedure per strategic technology. In addition to defining an SCD pipeline per technology, some additional points to consider are:
There are several models for managing platforms for a CI/CD pipeline. As with any model, it’s a trade-off between flexibility, standardization, support model, and cost factor.
A cloud model would use the existing CD-as-a-service offerings from cloud providers. Central hosting can be on-premise or in the cloud but is managed by the client. Team setup is where each team creates and maintains its own pipeline; the base image of the pipeline can be managed centrally. Containerization is just an extension of this model with a different deployment strategy.
Support for CI/CD tooling, at least the baselined SCD, should be centrally controlled. The tool's lifecycle management (LCM) support, with guidelines and coaching for team adoption, can be defined in the model.
There are several models for assessing CI/CD maturity. It's good to measure depth and breadth per system/technology so that adoption can be tracked. We also see integration of security, risk, and other operational aspects as part of the pipeline.
While we will continue to see DevOps itself evolve into new, mature avatars like DevSecOps, ChatOps, BotOps, NoOps, etc., it is important that the foundational enterprise setup is done right.
Here are some highlights:
As a bonus, Aloisio graciously answered some questions from the audience with details on how they got their 4000% increase in performance (his numbers, not ours).
A: It took about a year and a half with an iterative process. This included evaluation of tools, whiteboarding, learning the tool, building the pipelines, and some refactoring. And also developing trust in the automation pipelines.
A: No, we did not. It just took time to get to a stable, trustworthy, and valuable automation pipeline.
A: Yes, we had training and some professional services consulting from Electric Cloud in the beginning to help us understand the tool and how we could use it to fit our specific needs and environment. The rest we did on our own.
A: At first, we started the project with a few people who were only partly dedicated to it. So freeing up those people was one hurdle. It had a huge impact when we created a real team of these people, dedicated to the task.
Getting buy-in from stakeholders to help drive the initiative was another one. We were not showing progress and value fast enough because of the way we worked. Our initial approach was to build the automation pipelines as completely as possible before delivering. This led to a longer time before we got feedback from users and buy-in from stakeholders, and a loss of momentum.
We changed our approach and started working with MVPs (minimum viable pipelines) instead, which led to faster feedback, which in turn led to more stable and trustworthy automation and faster time to value, which got us buy-in.
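The "minimum viable pipeline" idea can be sketched in miniature: ship the smallest pipeline that delivers end to end, then splice in new stages as feedback arrives. Everything below (the stage names, the toy artifact strings) is a hypothetical illustration, not Electric Cloud's actual tooling.

```python
# Illustrative sketch of a "minimum viable pipeline": start with the
# smallest pipeline that delivers value end to end, then extend it
# stage by stage as user feedback builds trust.

def run_pipeline(stages, artifact):
    """Run each stage in order; stop at the first failure."""
    for name, stage in stages:
        artifact, ok = stage(artifact)
        if not ok:
            return f"failed at {name}"
    return "delivered"

# v1, the MVP: just build and deploy, so users see value immediately.
mvp = [
    ("build",  lambda a: (a + ":built", True)),
    ("deploy", lambda a: (a + ":deployed", True)),
]

# v2: the same pipeline, extended with a test stage once the MVP earned trust.
v2 = mvp[:1] + [("test", lambda a: (a + ":tested", True))] + mvp[1:]

print(run_pipeline(mvp, "app-1.0"))  # delivered
print(run_pipeline(v2, "app-1.1"))   # delivered
```

The design point is the order of work: the v1 list is deliberately short so stakeholders see a working pipeline early, rather than waiting for a complete one.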
A: If you have just started, try to show value or potential early. Prototype it or MVP it. Another key is to get feedback early and consistently – it is critical for building stability and achieving trust in what you are doing.
-It does take time to do something like this, so it is important to manage expectations, but also, like I mentioned, to show constant improvement and value delivery. This is key to getting the users and stakeholders to buy into the future potential and the end goals, and to make it tangible.
-Value delivery analysis and/or value stream mapping are always key tools in these types of initiatives. But keep in mind that they can get very large, complex, and sometimes not even understandable depending on the level at which you are doing them. It sometimes makes sense to split or layer these mappings so that you can get a clearer picture of where to focus efforts and then start to work more efficiently.
So there you have it:
To see how you, too, can enjoy up to 4000% faster deployments and still have your evenings and weekends anxiety-free, watch the webinar!
Remember to check out Electric Cloud and DZone’s DevOps Toolchain podcast taking place on July 17 at 10 am Pacific Time! Electric Cloud’s Anders Wallgren and DZone’s Tom Smith will be joined by guests from SmartBear, Atlassian, New Relic, and GitHub to discuss the current state of each stage of the DevOps toolchain. You can learn about the Continuous Discussions podcast here!
On a recent DevOps.com webinar, Gene Kim, founder of IT Revolution and co-author of “The Phoenix Project,” “The DevOps Handbook” and “Accelerate” and Anders Wallgren, Electric Cloud CTO, shared concrete tips for bridging the gap between Dev and Ops. During the discussion, each shared new trends in organizational structures, the importance of value stream mapping, tips for pipeline monitoring and tracking, and much more.
Continue reading for some of the top takeaways!
Wallgren speaks about the dichotomy of developer and operations teams and why the tension between the two exists:
“On the developer side, it’s really about creating new technologies, new features, experiment, move fast, change the tooling as new tools become available and are easier and better to use, etc. The motto for Dev is change the status quo, because we need to do more releases and features. Whereas on the Ops side, it’s more about maintaining the status quo. The goal is very much to keep the lights on and keep the money flowing in, and kind of be the engine for the company.” Wallgren goes on to explain how DevOps is helping operations teams handle the throughput and constant change from developers.
So, how do we bridge Dev and Ops and all the other organizations within a company? Wallgren explains that it comes down to understanding that we’re really all working towards the same goal:
“The overarching goal for everybody in the company is to provide value to our customers. And then, in return for that value, you get money or respect or fame or whatever it is that we’re looking for. Knowing and understanding that there is this shared goal of what we’re trying to do, and that we aren’t safely siloed in our own organizations and don’t have to care at all about what comes before or what comes after, is a very strong belief in the DevOps community, as is shared visibility. For all these different stakeholders and organizations to come together and smoothly, efficiently, and repeatedly interoperate and deliver value to our customers, that involves everyone. It is a development problem, it’s a quality problem, it’s an operations problem; all the lines of business have to participate, and even HR, finance, partners, and so on have to get involved in this.”
Referencing the image above, Kim explains the organizational evolution to Model 3:
“What we’re often seeing is the shift to Model 3 – product orientation or cross-functional organization. So, beyond technology, we typically do this when we want to optimize for quickly delivering value to customers. That means that any group can independently deliver value to customers because they have all the disciplines within their teams. They don’t need to beg, borrow or steal resources to do things that customers demand. By pulling people out of the silos, they have genuinely shared goals. The idea is that you get almost all of the benefits of model three without having to reorg everybody.”
Kim goes on to share some recent work about to be published around ticketing systems and makes a resonating statement:
“It is my big learning share that it doesn’t really matter where you are in the ticketing system, whether you are the producer of tickets, the consumer of tickets, or the doer of the tickets, when you’re trapped in this Byzantine labyrinth, it is sometimes a very joyless activity.”
Wallgren emphasizes the importance and value of value stream mapping:
“Value stream mapping helps you figure out, ‘What are we doing?’ How much of this is productive work and how much of this is just wait states, where there’s three hours of work being applied here but it’s three days of duration? Can we make that not be three days and take the wait states out of the system? I think part of the reason value stream mapping is so valuable and so useful is that it gets all the collaborators and all the stakeholders literally in the same room to walk through everything from when code gets checked in to how it makes it into production.”
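The wait-state arithmetic in that example (three hours of work inside three days of duration) reduces to a simple flow-efficiency calculation; the numbers below come straight from the quote.

```python
# A minimal sketch of the arithmetic behind value stream mapping:
# compare hands-on process time with total elapsed time to see how
# much of the duration is wait state.

def flow_efficiency(process_hours, duration_hours):
    """Share of elapsed time spent on actual work."""
    return process_hours / duration_hours

work = 3            # hours of productive work
duration = 3 * 24   # three days of elapsed time, in hours

eff = flow_efficiency(work, duration)
wait = duration - work

print(f"flow efficiency: {eff:.1%}")  # about 4.2%; the rest is wait state
print(f"wait state: {wait} hours")    # the hours value stream mapping exposes
```

A flow efficiency around four percent is exactly the kind of number that gets every stakeholder in the room to agree the wait states, not the work itself, are the problem.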
When modeling your pipelines, there are many things to take into consideration, but the more your pipeline looks like your value stream, the better. Wallgren goes on to share:
“If you have your pipeline put together based on your value stream and you’ve coded it, you now have documentation of how you do things. And if you then show that you ran that pipeline and that’s how you built and qualified, tested, released your code, then that is the proof that you do what you document.”
“Tisson Matthew as an engineering director within the transportation department had to cut across over 300 functions across the Amazon enterprise. It was one of the most heroic stories just because of the level of leadership and political sophistication needed to do that. There was no central planning function that could make sure that Tisson got what he needed. Basically, he was having to work with 300 parochial selfish people who only cared about their area and cared less about this new idea that’s been kicking around. So essentially, he had to use political skill and his Jeff Bezos card to emphasize that this is the most important initiative and they would use that to scramble the priorities. I share the story because I think we’re moving towards this product orientation [Model 3 above] where teams are more capable of delivering value to customers themselves. But for something like Amazon Prime Now, no one team could deliver Prime Now. They had to cut across essentially all of Amazon.”
Helpful hints on pipeline tracking from Wallgren:
“If we work on integrating data tracking as part of our pipeline efforts, then it becomes a very powerful tool for collecting and then reflecting on a lot of the interesting information. When you’re thinking about pipeline tracking, try to automate as much of the collection of data from the point tools as possible and the environments and all the pipeline jobs that run through it, whether it’s a build CI type pipeline or release pipeline or what have you.”
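One minimal way to realize that advice is to have every pipeline job emit a timing and outcome record automatically as it runs. The sketch below uses a Python decorator for this; the job names and record fields are illustrative assumptions, not tied to any particular tool.

```python
# Hedged sketch of automated pipeline data tracking: each job records
# its duration and outcome as it runs, so the data accumulates without
# manual collection. Job names and record fields are illustrative.

import time

RECORDS = []

def tracked(job_name):
    """Decorator that records duration and outcome for a pipeline job."""
    def wrap(fn):
        def run(*args, **kwargs):
            start = time.monotonic()
            status = "failure"
            try:
                result = fn(*args, **kwargs)
                status = "success"
                return result
            finally:
                RECORDS.append({
                    "job": job_name,
                    "status": status,
                    "seconds": time.monotonic() - start,
                })
        return run
    return wrap

@tracked("build")
def build():
    return "artifact"

@tracked("deploy")
def deploy(artifact):
    return f"{artifact} live"

deploy(build())
print([(r["job"], r["status"]) for r in RECORDS])
# [('build', 'success'), ('deploy', 'success')]
```

Because the collection is a side effect of running the pipeline, the records stay complete whether the run is a CI build or a release pipeline, which is the point Wallgren makes about automating the collection from the point tools.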
Architecture is at the root of productivity and joy in the day-to-day work of software delivery, explains Kim:
“These days what the findings show is that architecture has everything to do with how people do their daily work. Is it done on demand or are we hamstrung by ticketing systems? Everything should be done on-demand, self-service, not through ticketing systems and waiting two weeks, four weeks, six weeks to release. It’s those characteristics that allow us to get immediacy and fast feedback. Those are the conditions that allow us to get focus and flow not just in the Lean sense but in the sense of these are the conditions that allow us to have a mental state of effortless productivity and joy.”
Believe it or not, these tips and insights above don’t even begin to scratch the surface. To gain more invaluable information shared from Gene Kim and Anders Wallgren during the webinar, watch the entire replay here.
The following is an excerpt from a discussion that was held at DevOps Enterprise Summit London 2017 by Amine Boudali, Senior Project Manager, Nordea and Jose Quaresma, DevOps Lead DK, Accenture about the experience and lessons learned when setting up the Core Banking Platform in a containerized environment. You can watch the video of the presentation here.
We want to share with you our journey through core banking: how we’ve implemented the platform, some of the challenges we faced, and how we went about tackling them.
But first, a brief introduction to Nordea.
For this initiative, we had a focus on delivering incremental and frequent business value but also positioning Nordea to be the digital bank of the future.
In order to achieve this, Nordea partnered with Temenos as a software provider as well as Jose Quaresma’s team at Accenture as a system integrator.
Since the beginning of the project almost two years ago, we maintained a big focus on the following guiding principles:
#1 — Automation
Automation was a very big goal for us. This applied to environment provisioning, to test automation, and also to our CI/CD pipeline.
#2 — Everything As Code
Then we also had the goal of having Everything As Code, both for the environment configuration and, again, for the delivery pipeline. You really want to have everything in the repository, to be able to replicate things and have better control over the changes being made.
#3 — Short Feedback Loops
This would enable the developers and testers to get quick feedback on whatever they were doing and testing, helping them fail fast and learn fast so that they can move forward with their development and testing.
For our first goal, automation, while we weren’t doing full CI/CD yet, we were on our way there. We had continuous integration, and we had automatic build and deployment to our development and test environments. But we really wanted to push that further up the environments so we could get the full advantage of the work we were doing there.
From the Everything As Code perspective, we did not have any configuration management in place. We were using a custom-made solution at Nordea, which meant that having Everything As Code was still something we were striving towards.
Finally, from the perspective of our goal for Short Feedback Loops, we had daily builds and deployments, which was something that we’re very happy with considering the complexity of the system, but there were some challenges that prevented us from taking this further.
For example, there were some intricacies of deployment that actually required us to restart the WebLogic servers when deploying. So if we built and deployed every time there was a change, we would run the risk of having an environment that was restarted more often than it was actually up! That’s not helpful for a short feedback loop.
As we continued on our journey, we had some big challenges to overcome.
So, where did we end up?
For our long environment provisioning times, we went from weeks and months to under one hour.
This was amazing for us. One hour.
Here we are not talking about a database provisioning, we are not talking about an application server provisioning. We are talking about a full-fledged environment, in under an hour by using this solution.
We’re pretty happy with that.
For the proof-of-concept challenges we had, we developed better lifecycle management. In the past, standing up an environment and decommissioning it required going back to those units to decommission or reuse it, which meant you had to sequence your proofs-of-concept.
With this system, we could spin up an environment, use it for the proof-of-concept so that there were no dependencies, and decommission it whenever we wanted to.
Now onto the challenges we had with complex deployment and orchestrations.
Here, for the first time, we were able to use this product to do live deployments, which brought us down from one hour of downtime to zero downtime. Of course, this is not production-ready yet. We still have small things to iron out from a business perspective, but we are able to do live deployments.
Our fragmented environment configurations became full infrastructure as code. We talk to the developers, testers, and teams developing on this platform, and they help us improve the environments: they can put in merge requests, which we review and take in. This brings them into the world of how we provision the environments.
We shortened feedback loops. With the ability to provide end-to-end integration to the core banking system, we were able to do frequent deployments to the development environment, but also to support concepts such as time travel. This concept gives us the possibility to do end-of-year reporting, end-of-month reporting, or, for example, interest accumulation ahead of time. We don’t need to wait for that date: we can fast-forward the clock from today and run those types of tests to ensure the quality of the product.
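The time-travel idea amounts to injecting the clock into the reporting logic instead of reading it from the system, so a year-end run can be exercised today. The interest formula, rate, and dates below are illustrative assumptions, not Nordea's actual logic.

```python
# A minimal sketch of "time travel" testing: the as-of date is a
# parameter, so end-of-year interest accrual can be tested ahead of
# real time. The simple-interest formula and 2% rate are assumptions.

from datetime import date

def accrued_interest(balance, annual_rate, as_of, opened):
    """Simple (non-compounding) interest accrued between two dates."""
    days = (as_of - opened).days
    return balance * annual_rate * days / 365

opened = date(2018, 1, 1)

# In production, "today" would come from the real clock. In a
# time-travel test, we pass a future date to run year-end accrual now.
year_end = date(2018, 12, 31)
interest = accrued_interest(1000.0, 0.02, as_of=year_end, opened=opened)
print(round(interest, 2))  # 19.95, i.e. 364 days of accrual at 2%
```

The design choice worth noting is that the production code never calls the system clock directly; once the date is a parameter, "fast-forwarding time" is just passing a different argument.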
Our answer, and we’re not by any means saying that it’s the answer, was to use the Red Hat OpenShift platform to start this transformation.
We had two teams of about five people: a few focused on installing the platform, and the others focused on setting up the application on the new platform.
Here the migration was very much a lift and shift migration. We didn’t want to be thinking about which technology stack we should really be using in this platform, but more “let’s grab the technology stack that we have right now running, and see what it gives us.”
This took us around half a year, and what we ended up with was a setup where we have a core banking application project, in the OpenShift sense, that mostly consists of three containers:
Currently, we have around 30 of these projects running in our development and test environments. The teams use them to test features they want to play around with, for end-of-year reporting, for testing with the Time Travel feature, and so on.
Now let’s take a look at how this works. Here is a picture illustrating how live deployment with OpenShift works:
Here we are combining OpenShift’s built-in features with the application. We have a deployment in progress: the container you see on the left is the one currently running, and that’s where the traffic is being routed. The container on the right is being deployed with the updates, and it is just starting.
What OpenShift does for you here is that while the container on the right is starting, the traffic keeps going through the old one on the left; then, once OpenShift sees that the new one is ready and deployed, it shifts the traffic to the new container and kills the old one.
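That ready-then-switch behavior can be simulated in a few lines. This mirrors an OpenShift rolling deployment in spirit only: in reality, readiness comes from health probes and traffic is shifted by the router, not by a boolean flag as in this toy model.

```python
# Simplified simulation of the live-deployment behavior described
# above: traffic stays on the old container until the new one reports
# ready, then the router switches over and the old container is killed.

class Container:
    def __init__(self, version):
        self.version = version
        self.ready = False   # in OpenShift: the readiness probe result
        self.alive = True

def rolling_deploy(router, new):
    """Keep traffic on the old container until `new` is ready, then switch."""
    old = router["active"]
    while not new.ready:     # in reality: poll the readiness probe
        new.ready = True     # stand-in for the new container starting up
    router["active"] = new   # shift traffic to the new container
    old.alive = False        # kill the old one
    return router

v1, v2 = Container("v1"), Container("v2")
v1.ready = True
router = {"active": v1}

rolling_deploy(router, v2)
print(router["active"].version, v1.alive)  # v2 False
```

The key property, visible even in the toy version, is that at no point does the router lack a ready container to send traffic to, which is why the downtime drops to zero.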
We do have new challenges that we are now starting to focus on.
First, there was a certain lack of awareness of the platform within the different teams. The teams are very busy working on the features and the things they have to release. Then we came in with this platform and all its new features. So we’ve shifted to thinking more about how we communicate with the teams and inform them about the platform and the things they can do with it.
Another very important paradigm shift is moving from “treating our servers as pets” to “treating them as cattle,” to steal the famous analogy. If you make manual changes on the servers, they might be gone in half an hour, or whenever you do the next deployment. It’s key to have that mindset shift.
Next, we are pretty aware that we do have some heavy containers running on the platform. While the system is not heavier than it was before when it was running in a more standard VMware kind of environment, it is still very heavy.
This means we cannot take full advantage of a containerized environment, where environments could load very quickly. Instead, we have containers that are slow to load, quite heavy, and fairly big.
Looking forward, we will be thinking about how we manage our users and customers, but also our people, and figuring out when we’ll say that a feature is ready for you to use.
Currently, the platform is in a Dev and Test environment, but we’re hoping to bring it to a production-ready state sometime this year.
Rosalind Radcliffe is an IBM Distinguished Engineer responsible for DevOps for Enterprise Systems. She helps navigate transformations for both IBM and their clients.
Below, we’ve transcribed the key takeaways and main highlights of her presentation where she shows how a traditional z/OS product that’s been around forever can also transform (which you can also watch on YouTube here.)
When we started this process, we had release cycles that varied from relatively short to 18+ months. Our goal was to bring all of our z/OS development tools together in a single delivery pipeline in order to:
With our wonderful set of products delivering together instead of separately, we’d have a full DevOps pipeline for z/OS Explorer and all the products that sit on top of it, one that could give us value and give you value, and make it easier for all of us.
We created a single development pipeline.
Now this was a challenge. We have 17 separate products. They actually are still separate products, and in some cases, they’re totally separate. But we wanted the IDEs and the environments to be built together, so we literally stopped development for a while and said, “Other than critical customer situations, we’re not gonna do anything else. We’re gonna build a delivery pipeline.” And it took about four months.
We decided to do this in pieces. So we started with the base set of products; so z/OS Explorer plus CICS tools to get that base there. Then, we added products throughout the year, and we continue to add products.
So any product that has an Eclipse-based IDE, that is related in any way, shape, or form to mainframe development will end up on this pipeline.
The important thing to recognize about this is, while yes, it’s Eclipse-based IDEs, it’s also the backend pieces of the development for this. It’s not just Java development, it’s traditional PLX development that we use internally.
It’s assembler development, too.
And we have the DevOps pipeline for all the parts of this. We don’t use separate tools. We have one toolchain: whether you’re building the Z side or the distributed side, we have one SCM. No separation; all the code is together. Everybody can see what everybody’s working on, etc.
We want to make sure that the pipeline is consistent, so that when you’re doing DevOps, it doesn’t matter whether you’re doing DevOps for Z or for distributed: it’s one story, one set of processes, one set of capabilities.
Well, we had an advantage. We had the CICS team who had already started. They had started in 2005. So we had a lot of good lessons learned on how to do things and how to do this transformation.
But, what we ultimately learned was that:
As you can see, there is no reason that the mainframe should not be included in your DevOps transformation, so please remove the line ‘mainframe excluded’. I need everyone to help the mainframe developers understand that, yes, the mainframe can do DevOps.
We need to kill this concept of two-speed IT because it doesn’t work. Clients explain that it doesn’t work. We need to help everyone understand that multi-speed IT is really what we need.
John Allspaw is co-founder of Adaptive Capacity Labs and former Chief Technology Officer of Etsy. As an engineering leader and researcher with over 20 years of experience in building and leading teams engaged in software and systems engineering, Allspaw has spent the last decade bridging insights from Human Factors, Cognitive Systems Engineering, and Resilience Engineering to the domain of software engineering and operations.
Also the author of two books, “The Art of Capacity Planning: Scaling Web Resources” and “Web Operations” (O’Reilly Media), Allspaw continues to contribute to the IT and DevOps communities through speaking and collaboration on new, exciting research.
In fact, we were lucky enough to host John at the last DevOps Enterprise Summit in San Francisco, where he took to the stage to talk about “How Systems Keep Running Day After Day.”
Below, we’ve transcribed the key takeaways and main highlights of Allspaw’s presentation, enjoy!
What I want to talk about is new. It is different, and I feel very, very strongly about this.
To help set the stage, my thesis for my degree in Human Factors and System Safety was “Trade-Offs Under Pressure: Heuristics and Observations Of Teams Resolving Internet Service Outages.”
Some of you may have heard of this, what’s called the Stella Report.
At a high level, this report is the result of a year-long project by a consortium of industry partners: IBM, Etsy, and IEX, a trading exchange in Manhattan. Over this year, folks from the Ohio State University Cognitive Systems Engineering Lab, David Woods, Richard Cook, and a number of others looked deeply at an incident in each of those organizations.
They found six themes that were common across all of them.
Certainly, the results are quite important. It’s how that research was done that I want you all to take a look at.
Here are my main takeaways from the report:
First, I want to start with a little bit of a baseline, a bit of a vocabulary that’s going to be important as I sort of walk you through this. I’m going to describe a sort of picture, a representation, like a mental model of your organizations, and it’s going to have an above-the-line region and a below-the-line region.
If you imagine what we have depicted here, this is your product, your service, your API, or whatever your business derives value from and gives to customers. Okay? Inside there, what you see is your code. You see your technology stack. You see the data and some various ways of delivering this, right? Presumably over the internet or some other sort of way. But if we stay here, nobody’s going to believe me that that’s what we call the system, because it’s fine, but it’s not really complete.
What’s really connected, and what a lot of people have been talking about here in the DevOps Enterprise Summit community is all the stuff we do to manipulate what goes on in there, and so we have testing tools. We’ve got monitoring tools. We’ve got deployment tools and all of the stuff that’s sort of wired up. These are the things that we use. You could say that this is the system, because many of us spend our time focused on those things that are not inside the little bubble there, but all of the things that are around it, but if we were to stay just with this, we won’t be able to see where real work happens.
What we’re going to do here is draw a line that we call the line of representation, and then dig a little deeper. What we see here is you: all the people who are getting stuff ready to add to the system, to change the system. You’re doing the architectural framing. You’re doing monitoring. You’re keeping track of what it’s doing, how it’s doing it, and what’s going on with it.
Now, you’ll notice that each one of these people has some sort of mental representation of what that system is. If you look a little more closely, you’ll see that none of them are the same. By the way, that’s very characteristic of these types of roles. Nobody has the same representation of what is below the line.
To summarize, this is our model of the world, and it includes not just the things that are running there, but all of you, the kinds of activities you’re performing, the cognitive work that you’re doing to keep that world functioning. If we play with this a little bit more, we end up with this kind of model. This model has a line of representation going through the middle, and you interact with the world below the line via a set of representations.
Your interactions are never with the things themselves. You don’t actually change the systems.
What you do is that you interact with the representation and that representation is something about what’s going on below. You can think of those green things as the screens that you’re looking at during the day, but the only information that you have about the system comes from these representations. They’re just a little keyhole. Right?
What’s significant about that is that all the activities that you do, all of the observing, inferring, anticipating, planning, and correcting, all of that sort of stuff has to be done via those representations. So there’s a world above the line and a world below the line, and although we mostly talk about the world below the line as if it’s very real, very concrete, as though it’s the real thing, here is the surprise.
Here is the big deal – you never get to see it.
It doesn’t exist. In a real sense, there is no below the line that you can actually touch. You never, ever see code run. You never, ever see the system actually work. You never touch those things.
What you do is that you manipulate a world that you cannot see via a set of representations, and that’s why you need to build those mental models, those conceptions, those understandings about what’s going on. Those are the things that are driving that manipulation. It’s not the world below the line that’s doing it. It’s your conceptual ability to understand the things that have happened in the past, the things that you’re doing now and why you’re doing those things, what matters, and why what matters actually matters.
Once you adopt this perspective, once you step away from the idea that below the line is the thing you’re dealing with, and understand that you’re really working above the line, all sorts of things change.
What you see in the Stella Report, that project, and the other projects we’ve been engaged with is the result of taking that view and understanding what it really means to take the above-the-line world seriously. This is a big departure from a lot of what you’ve seen in the past, but I think it is a fruitful direction that we need to take.
In other words, these cognitive activities (see below) in both individuals and collectively in teams up and down the organization are what makes the business actually work. Now, I’ve been studying this in detail for quite a while here, and I can tell you this. It doesn’t work the way we think it does.
Finally, to set this frame up, the most important part of this idea is that all of this changes over time. It is a dynamic process that’s ongoing. This is the unit of analysis. Once we take that frame, we can ask some questions. We can ask some questions about above the line like this.
“How does our software work really, versus how it’s described in the wiki and in documentation and in the diagrams? We know that those aren’t comprehensively accurate.”
“How does our software break really, versus how we thought it would break when we designed safeguards and circuit breakers and guardrails?”
“What do we do to keep it all working?”
Question: Imagine your organization. What would happen if today at six o’clock everyone at your company took their hands off the keyboard? They don’t answer any pages. They don’t look at any alerts. They do not touch any part of it, application code or networks or any of it. Are you confident that your service will still be up and running after a day?
The question then is how to discover what happens above the line. Well, there are a couple of things. We can learn from the study of other high-tempo, high-consequence domains, and if we do, we can see that we can study incidents. (Note: when I say “incidents,” I mean outages, degradations, breaches, accidents, near-misses, and glitches; basically, untoward or unexpected events.)
What makes incidents interesting? Well, the obvious reason is lost revenue and reputation impact on the business. But I want to assert a couple of other reasons why incidents are interesting. One is that incidents shape the design of new component subsystems and architectures. In other words, the incidents of yesterday inform the architectures of tomorrow. Incidents help fuel our imaginations on how to make our systems better; what I mean is, incidents below the line drive changes above the line.
That’s the thing. This can cost real money. Incidents can have sometimes almost tacit or invisible effects, sometimes significant. Right now, a lot of people are splitting up a monolith into micro-services. A lot of people do that because it provides some amount of robustness that you don’t have. Where do you get that?
You’re informed by incidents.
Another reason to look at incidents is that they tend to give birth to new forms of regulations, policies, norms, compliance, auditing, constraints, etc. Another way of saying this is that the incidents of yesterday inform the rules of tomorrow, which influence staffing, budgets, planning, roadmaps, and more. Let me give you an example: in financial trading, the SEC has put into place Regulation SCI, which is probably the most comprehensive and detailed piece of compliance in the modern software era. The SEC has been very explicit: this is a reaction to the flash crash of 2010, to Knight Capital, the BATS IPO, the Facebook IPO. It is a reaction to incidents.
Even if you go back a little bit further, it’s often cited that PCI DSS came about when MasterCard and Visa compared notes and realized they had lost about $750 million over 10 years, so incidents have significant costs. By the way, as a former CTO of a public company, I can assure you that this kind of compliance is a very expensive, distracting, and inevitably burdensome albatross for all of your organizations. But incidents are significant in another way too. If we think about incidents as opportunities, as encoded messages that below the line is sending above the line, with your job being to decode them; if we think about incidents as things that actively draw your attention to parts of the system you thought you understood sufficiently but didn’t, then they are reminders that you have to continually reconsider how confident you are about how it all works.
Now, if you take this view, a whole bunch of things open up. There’s an opportunity for new training, new tooling, new organizational structures, new funding dynamics and possibly insights that your competitors don’t have.
Incidents help us gauge the delta between how your system works and how we think your system works, and this delta is almost always greater than we imagine. I want to offer a perhaps different take than you might be used to, and it’s this: incidents are unplanned investments in your enterprise, in your company’s survival. They are hugely valuable opportunities to understand how your system works, what vulnerabilities in attention exist, and what competitive advantages you are not pursuing.
If you think about incidents, they burn money, time, reputation, staff, etc. These are unavoidable sunk costs. Something’s interesting about this type of investment, though. You don’t control the size of the investment, so therefore the question remains, how will you maximize the ROI on that investment?
When we look at incidents, these are the types of questions that we hear, and they are quite consistent with what researchers find in other complex systems domains. What’s it doing? Why is it doing that? What will it do next? How did it get into this state? What is happening? If we do Y, will it help us figure out what to do? Is it getting worse? It looks like it’s fixed, but is it? If we do X, will it prevent it from getting worse, or will it make it worse? Who else should we call that can help us? Is this our issue, or are we being attacked? This is consistent with many other fields: aviation, air traffic control, and especially automation-rich domains.
Another thing that’s notable is that at the beginning of any incident, it’s often uncertain or ambiguous whether this is the one that sinks us. At the beginning of an incident, we simply don’t know, especially if it contains huge amounts of uncertainty and ambiguity. If it’s uncertain and ambiguous, it means that we’ve exhausted our mental models. They don’t fit with what we’re seeing, and those questions arise. Only hindsight will tell us if that was the event that brought the company down or if it was just a tough Tuesday afternoon.
Incidents provide calibration about how decisions are focused, about how attention is focused, about how coordination is focused, about how escalation is focused. The impact of time pressure, the impact of uncertainty, the impact of ambiguity, and the consequences of consequences. Research validates these opportunities.
“We should look deeply at incidents as non-routine challenging events, because these tough cases have the greatest potential for uncovering elements of expertise and related cognitive phenomena.”
– Gary Klein, the originator of naturalistic decision-making research.
There’s a family of well-worn methods, approaches, and techniques. Cognitive task analysis. Process tracing. Conversational analysis. The critical decision method. How we think postmortems have value looks a little bit like this:
An incident happens. Maybe somebody will put together a timeline. We have a little bit of a meeting. Maybe you’ve got a template, and you fill that out, and then somebody might make a report or not, and then, finally, you’ve got action items. We think that the greatest value, perhaps the only value, is when you’re in a debriefing and people are walking through the timeline and you’re like, “Oh, my God. We know all this.”
This is not what the research bears out. The research shows that if we gather subjective and objective data from multiple places, behavioral data, what people said, what people did, where they looked, which avenues of diagnosis they followed that weren’t fruitful, then well-facilitated debriefings get people to compare and contrast their necessarily flawed mental models. You can produce different results, including things like bootcamp and onboarding materials and new-hire training. You can have facilitation feedback if you build a program to train facilitators. You might make roadmap changes, really significant changes, based on what you learn.
I can tell you this from some experience. There is nothing more insightful to a new engineer, or an engineer just starting out in their career, than being in a room with a veteran engineer who knows all of the nooks and crannies, explaining things that they may never have said out loud. They have knowledge. They may draw pictures and diagrams that they’ve never drawn before, because they think everybody else knows it. Guess what? They don’t. The greatest value is actually here, because the quality of these outcomes depends on the quality of that recalibration. This is an opening to recalibrate mental models.
From the Stella Report: it “informs and recalibrates people’s models of how the system works, their understandings of how it’s vulnerable and what opportunities are available for exploration.”
In a lot of the research, in all of the research contained in the Stella Report, and it fits with my experience at Etsy as well, one of the strongest reflections from people who do this comparing and contrasting in a facilitated way is, “I didn’t know it worked that way.” Then there’s always the other one: “How did it ever work?” Which is funny until you realize it’s serious. What that means is: not only did I think it worked a different way, but now I can’t even draw a picture in my mind of how it could possibly have worked. That should be more unsettling. By the way, I want to say this is not alignment. Like I said, because we work via representations, we necessarily have incomplete mental models. The idea is not to have the same mental models, because they’re always incomplete, because things are always changing, and because they’re going to be flawed. We don’t want everybody to have the same mental model, because then everybody’s got the same blind spots.
“Blameless” is table stakes. It’s necessary, but it’s not sufficient. You can build an environment, a culture, a sort of welcoming organization that supports and allows people to tell stories in all of the messy, sometimes embarrassing details, without fear of retribution, so that you can really make progress in understanding what’s happening. You can set that condition up and still not learn very much. It’s not sufficient. It’s necessary, but not sufficient. What I’m talking about is much more effort than typical post-incident reviews. This is where an analyst, a facilitator, can prep: collating, organizing, and analyzing behavioral data, what people say and what people do. There’s a raft of data that they can sift through to prep for a group debriefing or a one-on-one debriefing. Postmortems only hint at the richness of incidents. Following up on this takes a lot of work.
By the way, everyone’s generally so exhausted after a really stressful outage or incident that sometimes everything becomes crystal clear. That’s the power of hindsight, and because it all seems so crystal clear, it doesn’t seem productive to have a debriefing, because you think you already know it all. The other issue is that postmortem debriefings are constrained by time as well. You only have the conference room for an hour or two. Everybody is really busy, and the clock is ticking, so this is a challenge for doing this really well, even given those research methods.
The other issue, especially if you build a debriefing facilitation training program like I did at Etsy, is that there are still challenges that show up. I like to call it, “Everyone has their own mystery to solve,” or, “Don’t waste my time on details I already know.” In a cartoonish way, you can think about it like this:
Because you may only have an hour, you need to extract as much learning as you can. All work is contextual. Your job, to maximize ROI, is to discover, explore, and rebuild the context in which work was done in an incident: how work happened and how people thought above the line.
Assessments are trade-offs, and those are contextual.
In closing, all incidents could have been worse. A superficial view is to ask, “What went wrong? How did it break? What do we fix?” These are very reasonable questions. At a deeper level, we could ask, “What are the things that went into making it not nearly as bad as it could have been?” If we don’t pay attention to those things and don’t identify them, we might stop supporting them.
Maybe the reason it didn’t get worse is that somebody called Lisa, and Lisa knows her stuff. Something from the research is that experts can see what is not there. What if you don’t support Lisa, and you don’t even identify that the reason it didn’t get worse is that Lisa was there? Forget about action items for fixing something for a moment. Imagine a world where Lisa goes to a new job.
More useful at a strategic level is a better question: “How can we support, encourage, advocate for, and fund the continual process of understanding in our systems, and really take ‘above the line’ seriously in a sustained way?”
Where do we go from here? I’ve got some challenges for you:
The increasing significance of our systems, the increasing potential for economic, political, and human damage when they don’t work properly, and the proliferation of dependencies and associated uncertainty all make me very worried. If you look at your own system and its problems, I think you’ll agree that we have to do a lot more than acknowledge this problem. We have to embrace it. Here’s what you can help me with: please spread this information, these ideas, and my presentation from DevOps Enterprise Summit San Francisco 2017.
John wants to hear from you. What resonated with you about this? What didn’t? What challenges do you face in your org along these lines? Go tell him – he’s on Twitter.
This is the tenth in a series of blogs about various enterprise DevOps topics. The series is co-authored by Sacha Labourey, CEO, CloudBees and Nigel Willie, DevOps practitioner.
In our previous article, we set out some principles on metrics and what and how to measure. In this blog, we’ll add a few more insights into metrics and their measurement.
As stated in our original article, we do not intend to define comprehensive metrics. We do want to take the opportunity to suggest a couple of areas we feel worthy of further consideration which are not regularly discussed. They are:
Moving to a new way of working and a new culture is difficult. Indeed, it is often far more difficult than the technical challenges. In our experience, in large enterprises, rebuilding the bridge between the business and IT is a critical step. Sometimes IT commences the transformation without sufficient engagement with the business. The product manager is critical, ensuring that the correct initiatives are delivered at the correct time. To measure the success of business engagement, we recommend you consider measuring the time between any deliverable being available for release into production and the actual production release. Please note we are aware of the release challenges of approvals and sign-offs – this is not an attempt to measure this step.
In simple terms, “lost opportunity time” is an attempt to understand whether technologists are currently working on the correct initiatives for the business. If content is being developed which is not required for production release as soon as it is complete, one could argue that the individuals concerned would be more profitably engaged working on other initiatives. Using this metric, we believe business unit and portfolio managers could identify potential areas where effort could be better optimized.
A real-world example: the lost opportunity time metric was triggered by a conversation Nigel had with a team at a former employer, a few years ago. Everybody had been challenged to be more Agile, and a lot of teams were trying to do the right thing. Nigel was chatting to a team who advised they had moved to sprints and had now completed four sprints. He then asked if they had all made it to production, only to be told, “None have yet, as we can’t find anyone in the business to sign off on the release.”
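To make the metric concrete, here is a minimal sketch of how "lost opportunity time" could be computed from deployment records. The field names, IDs, and dates below are purely illustrative, not drawn from any real tool:

```python
from datetime import datetime

# Hypothetical deployment records: when a deliverable became
# production-ready vs. when it was actually released.
deliverables = [
    {"id": "FEAT-101", "ready": "2018-03-01", "released": "2018-03-20"},
    {"id": "FEAT-102", "ready": "2018-03-05", "released": "2018-04-10"},
    {"id": "FEAT-103", "ready": "2018-03-12", "released": "2018-03-14"},
]

def lost_opportunity_days(items):
    """Days each deliverable sat release-ready but unreleased."""
    fmt = "%Y-%m-%d"
    return {
        d["id"]: (datetime.strptime(d["released"], fmt)
                  - datetime.strptime(d["ready"], fmt)).days
        for d in items
    }

delays = lost_opportunity_days(deliverables)
print(delays)                               # per-deliverable wait in days
print(sum(delays.values()) / len(delays))   # average wait across the portfolio
```

A portfolio manager scanning the average (or the outliers) in this output gets a quick signal of where effort is finishing long before the business is ready to accept it.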
This is a common concept. The product team provides an indication of the effort involved in delivering each requirement in their backlog; by the same token, business value points are a measure of the value to the product owner of each requirement. The rationale behind the two complementary measures is to enable a balance between effort and value to be achieved. This was adopted by the Agile community as an aid to backlog grooming. In our experience, you should not attempt to standardize the values of these points between product owners and teams; they have value within a team, but not between teams.
It is very easy to find yourself in a position where new, high-priority deliverables are passed to the technologists regularly. It is less common for the business to volunteer that something that was a high priority has now assumed a lower priority. The net result can be an increase in high-priority items over time, with the team spinning more and more plates. To an extent this can be inevitable, and increased throughput increases capacity, but it is a significant risk. Taking early steps to enable the enterprise to articulate value and demand can significantly improve the quality of decisions and the enterprise’s agility.
A real-world example: a product team has been working through a backlog for some time. Each item on the backlog was provided with business value points by the product owner. We can show that 70% of the business value has been delivered from the backlog. From the effort points for the remaining requirements, we can see based on current cadence it will take six months to deliver the rest of the backlog. Further, no significant new requirements have been added to the backlog by the product owner for a couple of months.
Based on this information, technology management can have a conversation with the business about the likelihood that this product moves toward business as usual (BAU) soon, with members of the team potentially moving to new initiatives on the business backlog.
In short, using metrics of this type can enable the enterprise to better identify the point at which products move from active to BAU or, where resources are constrained, which current initiatives could be quiesced to enable new priorities to commence.
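The arithmetic behind the example above can be sketched in a few lines. All point values, the velocity figure, and the backlog itself are hypothetical, chosen only to illustrate the calculation:

```python
# Each backlog item carries effort points (team's estimate) and
# business value points (product owner's estimate), plus a delivered flag.
backlog = [
    # (effort_points, value_points, delivered)
    (5, 8, True), (3, 13, True), (8, 5, True),
    (5, 3, False), (8, 2, False), (3, 1, False),
]

velocity_per_sprint = 8  # effort points the team completes per sprint (assumed)

delivered_value = sum(v for _, v, done in backlog if done)
total_value = sum(v for _, v, _ in backlog)
remaining_effort = sum(e for e, _, done in backlog if not done)

value_delivered_pct = 100 * delivered_value / total_value
sprints_left = remaining_effort / velocity_per_sprint

print(f"{value_delivered_pct:.0f}% of business value delivered")
print(f"~{sprints_left:.1f} sprints to clear the backlog at current cadence")
```

The key property is that both numbers are only meaningful within this one team: the percentages support a BAU conversation with the product owner, not a comparison against another team’s points.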
In a large enterprise, particularly prior to a DevOps transformation, the technologists often end up under-represented at a senior level within a company. Timeliness and budget are always discussed at a senior level, technical health less regularly. One initiative to encourage a level of discussion on technical health is to incorporate this as a core measure on any executive dashboards. RAG status dashboards (red, amber or green) are common in large companies. Project or programme managers will provide a RAG status for timeliness and one for budget adherence.
We recommend an additional RAG signifier to those that are traditionally discussed. Each product should have a technical lead who is accountable for the technical health of the product. They will have access to metrics that indicate the quality, and quality trend, of the product. There are several products available that provide this type of metric. The information about defects, architectural consistency, maintainability, etc. is not something that can be easily discussed at a senior level. A more useful approach is for the technical lead to use their judgment to provide a RAG status for product quality. This can sit alongside the other two indicators in discussions.
The relationship between the product owner and the technical lead is critical. The technical lead should be in a position where they feel comfortable recommending remedial or evergreening work on the product to the product owner, rather than pure business value delivery. The product owner should be confident enough in the technical lead to represent the need to carry out this work to the business stakeholders.
Real-world example: The quality RAG status is something Nigel has experienced, and has seen it reduce pressure on technologists to just deliver, regardless of the degradation in technical quality.
Remember that the act of measuring something changes behavior. You should watch for unexpected behavioral impacts of any metric collection. Using the examples above, you could see:
This is the ninth in a series of blogs about various enterprise DevOps topics. The series is co-authored by Nigel Willie, DevOps practitioner and Sacha Labourey, CEO, CloudBees.
We shall start this blog by stating that we are firm supporters of the idea that the key metrics of IT performance are:
We also recognize that when running a program across a large enterprise, you will be subject to requests for other metrics from your key sponsors. These may include: Speed of adoption, impact of changes, cost of the program, any associated savings, etc. The key metrics are outcome metrics; however, there are also a set of lower-level metrics that drive the outcomes.
In this article, we are going to start by trying to define a set of principles around metrics. We shall not attempt to define a comprehensive set of lower-level metrics, as many of these are specific to circumstance or organization. Rather, we trust that the principles will act as a guide to enable you to understand what, and how, you should obtain meaningful metrics.
Principles of metrics:
1. Only collect actionable metrics – Ask yourself what you will do if your proposed metric changes? If you can’t articulate an action, you probably don’t need to collect the information.
2. Focus on the value to the business, not the level of work to IT – The fundamental rationale of DevOps is increasing business value. Your metrics should reflect this. Please note IT performance measurements are valid if they impact business value and meet condition one above.
3. Collect a few simple-to-understand metrics – Don’t turn metrics collection into an industry.
4. Metrics should be role-based – Business value metrics to the business, program-based metrics to program leads, technical debt metrics to technicians – understand the target community and rationale and don’t inundate people with metrics. We like the description that metrics should be offered as a “smorgasbord, where consumers select those which are pertinent to them for their personal dashboard.”
5. All metrics collection and collation should be automated – This should be self-evident to the DevOps practitioner, who is looking to automate delivery of technical capability. First, manual collection of metrics is time-consuming and counter to the program you are delivering. Second, you cannot obtain real-time information manually; as a result, you trend away from a proactive posture and, instead, toward a post-mortem one.
6. All metrics should be displayed on a unified dashboard – Don’t expect people to hunt for them. This unified dashboard can be personalized for the consumer and the team, as per point four. The key consideration is that the customer should be able to find all the metrics they want in one place.
7. Prioritize raw numbers over ratios – With the exception of change fail rate, which is correctly a ratio, we recommend the collection and display of raw data. This is particularly pertinent for metrics aimed at technicians. This both promotes the holistic application of these numbers by the technical specialists within the team and reduces the risk of team-level metrics being used to compare performance across teams with no contextual understanding.
Because this is an important point, we are going to further explain. In most large enterprises, there is an annual discussion around levels of reward and bonus. Anything that is collected as a ratio can be grasped at by enterprise management to argue for their team. Nigel has been involved in many of these meetings over the years (far too many) and managers always fight for their teams to be highly rewarded when the organization starts fitting teams across a bell curve of performance. With ratios, you end up with conversations that go like this, “My team has a ratio of 75% and your team’s ratio is 71%, which supports my argument for a higher reward.” 90% of these metrics are meaningless at a comparative level and, by using raw numbers that the team itself can use to amend their behavior, you meet the primary need and reduce the risk of numbers being taken out of context and used for meaningless comparison. Of course, all meetings leak like sieves and the teams soon get to hear, rightly or wrongly, that their bonus was affected by, for example, the ratio of new lines of code compared to amended lines. They then amend their behavior to impact their bonus, not the organization’s needs.
8. Use the correct metric for the use case – Different use cases demand different metrics – for some products, it is velocity; for others, it is stability and availability. This principle is for the consumer of the metrics rather than the supplier, but it is critical. Our blog on context should make it clear that the primary success indicator for each product may not be the same.
9. Focus on team, not individual metrics – DevOps looks to drive a culture of cooperation and teamwork to deliver success. As your culture starts to change, you should see a greater focus on the recognition of teams rather than individuals. To support this, your metrics should focus on teams too.
10. Don’t compare teams, compare trends – If we accept point 8, that different teams have different primary metrics, we should also accept that each team will have different goals. Additionally, if raw data is used for many metrics it makes little sense to compare teams. Rather, the product teams, business units, and key sponsors should compare trends within their teams and units.
11. Look for outliers – While avoiding direct comparisons between teams, it is still sensible to look for outliers. If these are identified, you should look for clues as to why certain teams are either significantly over or underperforming their peers. These can often provide significant learning points that add value to others.
12. Lead time is time to production, not time to completion – This is a fundamental principle. It is repeated here, as from our experience initial stages of adoption are often accompanied by a focus on reducing the time to production readiness. The last step of continuous delivery is often a follower, and it is critical that lead time measures time to production and nothing else. You should also be wary of soft launches being adopted if the formal production release to market is not a close follower.
13. Use secondary metrics to mitigate unintended consequences – For example, a focus on time to market could negatively impact quality, which is why the key metrics of IT performance contain both. If you focus on a specific metric, you should ask what the negative impacts of that focus could be and monitor trends in this space. This applies even if you have taken the conscious decision that you are happy to suffer these consequences.
In our next blog, 4 Further Considerations for Metrics Measurement, we will take the opportunity to suggest a couple of areas we feel worthy of further consideration which are not regularly discussed in organizations.
Recently, Gene Kim had a chance to speak with Mirco Hering about Mirco’s new book, DevOps for the Modern Enterprise.
Gene Kim: What inspired you to write DevOps for the Modern Enterprise?
Mirco Hering: To be honest, there are a number of DevOps books in the market and I was wondering whether I had something unique to say. But then I realized that not a lot has been written about the complex organizational environments that people find themselves in, with legacy applications and multiple vendors working alongside people from their own organization. I wanted to address the challenge of a management mindset that was finely tuned for a different world, where IT was considered to be predictable. I wanted to share what I have learned about those environments, and what I’ve learned from many failures and near-misses as well as my successes. My goal was to offer some simple activities that everyone can do in their own organization and to help the reader set themselves up for a successful transformation into a modern enterprise. I hope to reach more people with this book than I would ever be able to personally work with, and to help them on their journey to a better, more exciting workplace.
GK: You vividly describe the incredibly complex ecosystems that enterprises and their consulting/system-integrator partners operate in. What would you want everyone in both parties to know that would help them?
MH: First, that in my experience both sides genuinely want to find win/win scenarios and do the right thing. Engagements and relationships were originally created with the right intentions. Over time the environment and capabilities changed, but we did not keep up with the contracts and incentive models used. We still have the previous mindset and knowledge of what good contracts look like. It takes courage to question those and change the rules of engagement for the DevOps and Agile way of working, but it is worth doing, as both sides stand to win from it. Find people on the other side who have the same interest and start talking directly (and, ideally, face to face), and you will be able to shift the engagement over time, one step at a time. The game of telephone that is sometimes introduced by having several intermediaries distracts from the real goal. Find the right person in the other organization and initially find creative ways to make it work with the sub-optimal contracts in place. You don’t have to throw existing contracts out and start from scratch. Make sure you are on the same page and find small, creative increments of change. It’s amazing what can be done when people on both sides are on the same page and push in the same direction.
GK: If your readers take away just one thing from your book, what do you hope that would be?
MH: I hope they find at least one exercise that they will run in their organization, and learn something useful for their transformation from the book. The one thought I want them to take away is that you don’t need complex frameworks to drive your transformation, but rather that you need the right mindset and principles to drive it, everything else is additional help that you can tailor to your needs. Implement a strong continuous improvement process based on the scientific method to drive your transformation. And remember that the mindset that made IT successful in the past is not the mindset that we need now and will need in the future. Okay, I guess that was two things, but I think they go hand-in-hand.
GK: You state that “cultural change that is the hardest is also the most impactful.” For something that feels so abstract, what one concrete change can your readers implement today to help them down the path of a broader cultural change?
MH: A truly open-minded and blameless continuous improvement process is a very good central change engine for cultural change. It should span across organizational boundaries. Once you are in a room where people from all across your organization and from your vendors collaborate to find ways to improve, you will know that the culture has shifted. I found that collaboration is easy when it is just one on one, but culture manifests itself when larger groups come together. The aforementioned rigorous experimentation and evaluation process to assess progress, at the core of continuous improvement, takes personal agendas off the table. This process can be aimed at all aspects of IT delivery, your overall business processes and your engagements with stakeholders. As soon as this process is somehow constrained, or involves finding blame or excuses, it loses many of its benefits and negative culture will sneak back in. Once results improve through the right behaviors, your culture will start to shift too. You cannot change culture by itself, in my opinion; you have to achieve results with a new way of working, which will, in turn, shift the culture.
GK: What is the biggest challenge(s) facing legacy IT organizations today?
MH: The biggest challenge for legacy IT organizations is that the complexity of their IT environments is shaping their worldview. They see it as a reason for why change is not possible. While it is true that it is harder, it is absolutely possible. The problem is that looking at “unicorns” and trying to emulate them is not going to make you successful. You cannot just use the “Spotify-model” or adopt the “Netflix-culture.” You have to do the hard yards of adapting your own organization in your ecosystem to the new world. The good news is that it is fun once you start understanding what success feels like, and I have seen many organizations starting to make that shift.
GK: In your vision, what will the modern IT organization look like in ten years? What will be their biggest challenge(s) and their biggest strengths?
MH: Looking 10 years into the future basically sets me up for an embarrassing follow-up interview, so let me make that follow-up entertaining by pushing the boundary. The modern IT organization in 10 years will use a mix of technologies (cloud, on-premise, custom-built and packaged), similar to many today, but will have detangled them to evolve at the appropriate speeds, independently from each other. Business stakeholders will work in end-to-end teams that operate and change the systems which run the business, and leverage commodity services from in-house and external providers for most of the non-differentiating aspects of their IT landscape. The challenge will have moved from a problem of scarcity (too slow, too expensive) to one of plenty (lots of change, fast-moving). We will write blog posts about the need to not react to every data point or trend, and to work on “out of the box” innovation that requires us to stay a course for a while even when data and customers suggest we need “faster horses” to develop the equivalent of “cars.” IT challenges will not be technology issues; they will be business challenges of identifying what is providing value and what is unnecessary and can be cut back. Data aggregation and synthesis will be at the core of IT as the remaining constraint, and artificial intelligence will be employed as a partner in our governance processes.
As digital technology continues to disrupt and transform businesses across industries and around the world, the ability to rapidly deliver high-quality software will make the difference between survival and extinction for many companies. Ultimately, successful adoption of DevOps and the process of continuous delivery are likely to determine whether an organization thrives or fails in the digital age.
DevOps is a practice that gives organizations the ability to develop and deploy software faster and more efficiently, enabled by automated processes such as continuous delivery. DevOps focuses on cultural transformation and making it easier for development and operations teams to collaborate and achieve shared objectives. Continuous delivery is a process that lets developers continuously roll out tested code that is always in a production-ready state. With continuous delivery, application development teams use automation to deliver updates faster and with fewer errors. Once a new feature or update is complete, the code is immediately available for deployment to test environments, pre-staging or live production.
Together, cultural change and process automation accelerate the creation and delivery of high-quality software, saving many organizations millions of dollars and hundreds of thousands of developer hours every year that can now be spent on innovation and not on administration.
A CloudBees assessment of more than 100 companies that have adopted DevOps found that continuous delivery processes helped them save an average of $3,500 and 66 hours per developer, per year. The CloudBees analysis, which is based on a conservative cost of $53 per hour for a developer, includes companies that represent a variety of sizes, industries, and regions. These organizations employed an average of 1,530 developers across 11 teams, which translates into annual savings of $5,355,000 and 100,980 hours, a gain of 12,622 development days per year. For larger organizations, the gains are even more impressive. A company with 10,000 developers, for example, could save $35 million and 660,000 hours annually, for a gain of 82,500 developer days every year. What’s the true value of these cost and time savings? Returning hours back to developers, to focus on innovation that keeps a company ahead of the competition, retains existing customers and attracts new ones.
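The savings arithmetic cited above can be reproduced in a few lines. The per-developer figures come from the text; the hours-to-days conversion (8-hour days) is an assumption made to match the reported day counts:

```python
# Figures quoted in the CloudBees assessment.
SAVINGS_PER_DEV = 3500   # rounded USD savings per developer per year
HOURS_SAVED = 66         # hours saved per developer per year

def annual_savings(developers, hours_per_day=8):
    """Return (dollars, hours, developer-days) saved per year."""
    dollars = developers * SAVINGS_PER_DEV
    hours = developers * HOURS_SAVED
    return dollars, hours, hours / hours_per_day

print(annual_savings(1530))    # average org in the study
print(annual_savings(10000))   # large-organization example
```

Running the numbers for 1,530 developers yields the $5,355,000 and 100,980 hours quoted above (about 12,622 eight-hour developer days), and for 10,000 developers the $35 million, 660,000 hours, and 82,500 days.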
The impact is compelling across industries, company sizes, and geographic regions. For example, Capital One has increased deployment frequency by 1,300%. “With the CloudBees Jenkins Platform we’ve created a service for our developers that’s scalable and stable,” says Brock Beatty, director, software engineering, Capital One. “As a result, the time they would’ve spent managing infrastructure is now spent developing business applications. That has contributed to our ability to increase deployments from a couple per year to now deploying every two weeks.”
The value is visible for small-to-medium sized organizations as well. “In thinking about how we can deliver better products faster for our customers, we adopted a service-oriented mindset – breaking systems and workflows into smaller pieces that can be delivered or executed quickly with an automated pipeline,” says Jack Waters, senior vice president of engineering at WatchGuard, which provides enterprise-grade network security appliances and wireless security hardware. “We’re now able to get big things done noticeably faster, whether it’s implementing encryption or changing the way our back-end databases are set up. Activities that would take months are now taking days to weeks to complete.”
The value of adopting DevOps and continuous delivery is not just about improving the software development process; the ultimate advantage is the array of business benefits it delivers.
DevOps and continuous delivery allow organizations to drive innovation that sharpens their competitive edge while reducing costs, increasing revenue and ensuring faster time to market. Using DevOps and continuous delivery, companies can improve collaboration and productivity, while reducing risk. They’re also able to strengthen brand equity, improve customer service and satisfaction, and create a working environment that makes it easier to attract and retain top talent.
Hurwitz & Associates recently completed a CloudBees-sponsored study of 150 top IT decision makers, 77 percent of whom reported either company-wide or business unit implementation of continuous delivery. When asked how implementing continuous delivery had affected their business, 81 percent said continuous delivery is helping their organization bring value to customers and deliver on business goals. Approximately 44 percent reported significant improvements in their organization’s ability to provide customer value and meet business objectives. Equally important, none of the participants reported a decline in meeting business goals after implementing continuous delivery.
As accelerating innovation and responding rapidly to shifting market and customer demands become more critical for the success of every organization, a growing number of business leaders are recognizing IT as a strategic asset. Being able to shorten application delivery times, improve software quality and quickly adapt to change while dealing effectively with security, availability, and compliance is just the type of competitive advantage organizations are seeking.
“We can’t afford to be complacent about what we have already offered; it’s about what else we have to offer our customers,” says Waters. “Having a healthy continuous delivery pipeline has really helped us stay competitive, maintain the level of quality that WatchGuard is known for and deliver new products with a high degree of confidence.”
DevOps, like Agile, has transformed enterprise software delivery. Thanks to sprints, prioritization, CI/CD, and release automation, organizations are building and deploying software products faster than ever. That pesky bottleneck between code commit and deploy has been all but eliminated, which should ensure better time to value for customers.
Yet if your flow time – i.e. end-to-end lead time – is still too long, unpredictable, and unmeasurable, it’s likely you’ve only shifted the bottleneck further upstream. Sure, automation has sped up handoffs and communication between developers and operations, but what about everything else that happens in the process?
What about all the other manual processes that take place before and after a piece of code is written? If there are still manual handoffs at key stages of the process, then your overall workflow is still being impeded by bottlenecks outside of the DevOps stage.
As Dominica DeGrandis, our Director of Digital Transformation, explains in her latest article for TechBeacon, you can only identify and remove these bottlenecks if you can see them. A LOT happens before “Dev” and after “Ops.” A lot of creative thinking and activity ensures the right product is built, maintained and delivering value to the end user. And unless you can trace and automate the flow of work from ideation to production, you won’t be able to optimize the process. You need to collect and consolidate all data that pertains to planning, building, and delivery of the product.
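Measuring flow time becomes straightforward once every work item carries timestamps for ideation and for production release. A minimal sketch of the idea, where the item structure and dates are illustrative rather than drawn from any particular tool:

```python
from datetime import datetime
from statistics import median

# Hypothetical work items, each stamped at ideation and at production release.
items = [
    {"id": "FEAT-1", "ideated": datetime(2018, 1, 2),  "released": datetime(2018, 3, 1)},
    {"id": "FEAT-2", "ideated": datetime(2018, 2, 5),  "released": datetime(2018, 2, 20)},
    {"id": "BUG-7",  "ideated": datetime(2018, 2, 10), "released": datetime(2018, 2, 14)},
]

def flow_times_days(work_items):
    """End-to-end lead time, ideation to production, in whole days per item."""
    return [(i["released"] - i["ideated"]).days for i in work_items]

times = flow_times_days(items)
print("median flow time:", median(times), "days")
```

The point is not the code itself but the prerequisite it exposes: you can only compute this once planning, building, and delivery data are consolidated with a shared identifier per work item.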
So how do you avoid bottlenecks and accelerate your DevOps (and other IT) transformations? First, you need to ask some important questions:
If the answer is “no” or you’re not sure, then it’s likely your software delivery value stream is still a mysterious black box of activity, and not optimized as a result. With no visibility into the end-to-end process, how do you know where to look for bottlenecks? How do you know where the opportunities are to create more value?
For a deeper look into how to find and remove bottlenecks, check out Dominica’s piece Break through those DevOps bottlenecks.
Download this short e-book to learn why your Agile and DevOps initiatives are struggling to scale.
For a more dynamic discussion, request a personalized demo of your software delivery value stream. We can help you connect your value stream network, spot bottlenecks, and dramatically improve how fast and well you deliver innovative software products.
Over the years, I’ve seen technology trends come and go. Some changed our lives forever; some disappeared without a trace. In the most successful cases, however, the technology became mainstream thanks to refined processes for its use and a tidal wave of enthusiastic, skilled practitioners. No one is scared of their mobile interface anymore, and Smart TVs are easier to operate.
It was only recently, in the last year or so, that I learned about COBOL technology. COBOL, the Common Business-Oriented Language, was developed nearly 60 years ago by a small group of computer professionals called the Conference on Data Systems Languages (CODASYL) that included Grace Hopper, the “mother of COBOL”, and Jean Sammet. The main objective of the committee was to develop a standard business-oriented language. COBOL was ground-breaking because it could run on more than one manufacturer’s computer. Because of its ease of use and portability, and because the US Department of Defense required COBOL on all its computer system purchases, COBOL quickly became one of the most used business programming languages in the world.
Interestingly, as I looked at the phenomenal success and continued pervasiveness of COBOL, it became clear that it was founded on strong technology, certainly, but also that the processes built to leverage it were simple and clear, focused on business outcomes and on modelling business processes. The people side was taken care of, too, because COBOL was easy to learn: its English-like syntax was designed with ease of use in mind. Little wonder it’s one of the few genuine constants in the IT world.
Looking across the three axes of people, process, and technology, the facts speak for themselves. I’ll start with technology; after all, that’s what COBOL is.
From what I read initially, people often complain or worry about businesses, banks, and government agencies still running COBOL, usually under the pejorative label of “legacy” code.
While the concept of COBOL dates from 1959, today’s incarnation of COBOL is modern and far from what I (or anyone else) should consider legacy. It has the same contemporary look and feel as any other language; works within standard development IDEs such as Visual Studio and Eclipse; supports the latest enterprise technology, including cloud, mobile, and managed code; can live inside a virtualized or containerized environment; and supports web services, object orientation, and an API model. In short, whatever you are doing today, your COBOL systems will support it.
Moreover, COBOL still does all the things it’s known for: rapid, precise calculations; managing massive amounts of data; and the accuracy and reliability that the enterprise systems it powers depend on. In 2014, the Wall Street Journal reported that each day 80 percent of the world’s business transactions rely on COBOL. COBOL is contemporary and is strongly entrenched in businesses around the world.
Mike Madden, development service manager with the British women’s clothing firm JD Williams, was asked why they still use COBOL on their mainframes. His reply: “Simple – we haven’t found anything faster than COBOL for batch-processing. We use other languages, such as Java, for customer-facing websites, but COBOL is for order processing. The code matches business logic, unlike other languages.” According to IBM’s Charles Chu, “…there are 250bn lines of COBOL code working well worldwide. Why would companies replace systems that are working well?”
Unsurprisingly, the well-known Computerworld survey from a few years back found that 64% of respondents said their organization or systems used COBOL, and 48% used it significantly. More recently, a survey by Micro Focus revealed respondents’ top modernization priorities.
That brings me to the other two important criteria: processes and people.
The modern era of computing has witnessed a significant shift in how applications are brought to market. The Agile era is upon us, and surveys report adoption of the modern “DevOps” approach to building applications running as high as 80 percent.
That raised the question: does this preclude old-school technologies like COBOL?
COBOL technology, in its modern incarnation, is a modern language and works alongside and within agile and DevOps development practices. Indeed, COBOL’s pervasiveness, ease-of-use, and integration with contemporary toolchains make it an ideal candidate for DevOps-style process improvement, whether on the mainframe or in a distributed environment.
Another major concern with COBOL is the fact that many of the programmers are aging and retiring. A Computerworld survey found that more than half of COBOL programmers in organizations surveyed were over the age of 45. Fast forward a few years and the 2017 mainframe survey by BMC talked about a resource pool half of whom were under the age of 50, suggesting a significant shift in the demographic of the COBOL IT shop.
In 2013, Micro Focus collected research on the lack of COBOL courses being offered in academia. Shockingly, despite the fact that 71% of university respondents believed that today’s businesses will continue to rely on COBOL code and applications for the next 10+ years, only 27% offered IT courses that include COBOL programming as part of their curriculum. More recently, evidence suggests a change of attitude. IBM has been working to keep mainframe skills alive in younger generations with its “Master the Mainframe” contest, as well as formal degree programs. Micro Focus has equipped a group of over 300 academic partners with contemporary COBOL technologies for teaching students COBOL. In fact, Micro Focus taught a class at the most recent SHARE Academy and also spoke on the topic of skills at the SHARE Sacramento event. In his recent blog post, Derek Britton, Director of Strategy and Enablement at Micro Focus, asserted that “a combination of attitude, motivation, and support… technology is the key to overcoming any skills concern.”
In my short time learning about the COBOL world, I am astonished more isn’t said about its power and prevalence in the IT world. If the success of a technical innovation can be measured across the three axes of people, process, and technology, then the scores for COBOL in 2018 remain very impressive. Micro Focus remains committed to COBOL; its pervasiveness should be no surprise to any of us.
If any of this information about COBOL is news to you, I urge you to learn more.
This is the seventh in a series of blogs about enterprise DevOps. The series is co-authored by Nigel Willie, DevOps practitioner, and Sacha Labourey, CEO, CloudBees.
Hopefully, most of you have read The Phoenix Project. If not, we highly recommend it. (It is actually required reading for all new CloudBees employees.) Those of you who have read it will be aware that one of the critical issues in the story is the company’s dependence on Brent, the key technologist who solves everybody’s problems. It becomes clear in the book that Brent, as competent as he is, is a logjam, because everything passes through that one individual.
Any centralized team introduces exactly this risk. It is one of the fundamental objections that many people raise whenever the idea of a central DevOps team or function is raised. We acknowledge this is a valid concern. Our previous article on creating a service line advises some potential approaches.
Regardless of the structure, or structures, you choose, it is critical that you avoid becoming a bottleneck to progress. In a large enterprise, this can be very difficult, as DevOps initiatives tend to be high-profile, flagship programmes. As a result, senior stakeholders are extremely keen to demonstrate rapid progress. This, of course, is a significantly better problem to face than customer apathy, which it sometimes pays to remind yourself of as you try to keep many plates spinning.
Many vendors now deliver capabilities as services. For example, CloudBees Jenkins Enterprise provides a delivery framework that couples a level of central control and consistency with customer self-service. It also decomposes the previous architecture based around larger, shared masters to a more flexible approach based around individual team- or project-specific masters, spun up in minutes. All this supports a self-service approach to be pursued in a larger enterprise.
Whether using external or internal capabilities, we recommend that you prioritize customer self-service as a core competence. There are fundamentals that need to be in place to achieve this: exposure of services via APIs, defined patterns that integrate into an end-to-end flow, templated pipelines with drag-and-drop capabilities, and so on.
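To make the self-service idea concrete, a request to provision a team-specific master might sit behind a simple internal API. Everything in the sketch below (the template names, the payload fields, the function) is hypothetical, shown only to illustrate the shape of such a service, not any real CloudBees endpoint:

```python
import json

# Hypothetical catalogue of templated pipelines a team can choose from.
TEMPLATES = {"java-maven", "node", "python"}

def provision_request(team, template, business_stream):
    """Build the JSON body for a hypothetical self-service
    'create team master' call, validating the chosen template.
    """
    if template not in TEMPLATES:
        raise ValueError(f"unknown pipeline template: {template}")
    return json.dumps({
        "team": team,
        "pipelineTemplate": template,
        "businessStream": business_stream,
    }, sort_keys=True)

body = provision_request("payments", "java-maven", "retail-banking")
print(body)
```

The validation step is where the central team retains control and consistency: teams serve themselves, but only from approved, supported templates.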
Today, many larger enterprises follow a matrixed organizational structure. We do not intend to discuss the advantages of various organizational structures in these articles; we would, however, pass on specific advice for anyone working in an organization of this type. It is a given that any service will need maintenance and upgrades, and these potentially impose a short service outage. From experience, if a service is shared across business streams within the organization, agreeing on an outage can involve significant negotiation. It is human nature to believe your personal priority has primacy, and in matrixed organizations, cross-business-stream priority requires consensus. When architecting any service, in addition to the usual considerations (availability, scalability, service latency, etc.) it is useful to consider this factor when allocating consumers to servers, masters, availability zones, LPARs, etc.
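The allocation advice above can be sketched as a simple rule: group consumers onto shared infrastructure by business stream, so that any maintenance window needs sign-off from one stream only. A toy illustration, with made-up team and stream names:

```python
from collections import defaultdict

# Hypothetical (team, business stream) pairs.
teams = [
    ("cards", "retail"), ("mortgages", "retail"),
    ("fx", "markets"), ("equities", "markets"),
]

def allocate_by_stream(team_stream_pairs):
    """Map each business stream to its own master/server group,
    so an outage on one group requires agreement from one stream only.
    """
    allocation = defaultdict(list)
    for team, stream in team_stream_pairs:
        allocation[stream].append(team)
    return dict(allocation)

print(allocate_by_stream(teams))
# {'retail': ['cards', 'mortgages'], 'markets': ['fx', 'equities']}
```

Maintenance on the “retail” group now needs agreement only from retail stakeholders, avoiding the cross-stream negotiation described above.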
In short, key considerations when providing a centralized capability should be:
The role of any central IT team is to enable its customers to deliver rapidly via fully supported, consistent automation capabilities. It is not acceptable to become an impediment to progress for the entire enterprise by imposing dependencies on one group of individuals.