How to Recover a SQL Server Database That Is Corrupt


In this article, we will see how to restore a SQL Server database that is corrupt. To solve the problem, we will use a dedicated tool, Kernel for SQL Database Recovery, which you can download here:

Original Link

Classification From Scratch, Part 3 of 8: Logistics With Kernels

This is the third post of our series on classification from scratch, following the previous post that introduced smoothing techniques with b-splines. Here we consider kernel-based techniques. Note that we do not use the “logistic” model here… it is purely non-parametric.

Kernel-Based Estimation, From Scratch

I like kernels because they are somehow very intuitive. With GLMs, the goal is to estimate

$m(x) = \mathbb{E}[Y \mid X = x]$

Heuristically, we want to compute the (conditional) expected value in the neighborhood of x. If we consider some spatial model, where x is the location, we want the expected value of some variable Y, “in the neighborhood” of x. A natural approach is to use some administrative region (county, department, region, etc.). This means that we have a partition of the space where the variable(s) lie. This yields the regressogram, introduced in Tukey (1961). For convenience, assume some interval/rectangle/box type of partition. In the univariate case, consider

$\hat m(x) = \dfrac{\sum_{i=1}^n Y_i\,\mathbf{1}(X_i \in I_j)}{\sum_{i=1}^n \mathbf{1}(X_i \in I_j)} \quad \text{where } x \in I_j$

or the moving regressogram

$\hat m(x) = \dfrac{\sum_{i=1}^n Y_i\,\mathbf{1}(|X_i - x| \le h)}{\sum_{i=1}^n \mathbf{1}(|X_i - x| \le h)}$

In that case, the neighborhood is defined as the interval (x ± h). That’s nice, but clearly very simplistic. If Xi = x and Xj = x − h + ε (ε > 0), both observations are used to compute the conditional expected value. But if Xj′ = x − h − ε, only Xi is considered, even though the distance between Xj and Xj′ is extremely small. Thus, a natural idea is to use weights that are a function of the distance between the Xi’s and x. Use

$\hat m(x) = \dfrac{\sum_{i=1}^n \omega_h(X_i - x)\, Y_i}{\sum_{i=1}^n \omega_h(X_i - x)}$

where (classically)

$\omega_h(u) = \dfrac{1}{h}\, k\!\left(\dfrac{u}{h}\right)$

for some kernel k (a non-negative function that integrates to one) and some bandwidth h. Usually, kernels are denoted with a capital letter K, but I prefer to use k, because it can be interpreted as the density of some random noise we add to all observations (independently).
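To make this concrete, here is a minimal Python sketch of the kernel-weighted average above, using a Gaussian kernel (the data are synthetic, chosen only so the answer can be checked by hand; the R version on the actual dataset follows below):

```python
import numpy as np

def nadaraya_watson(x, X, Y, h):
    # Gaussian kernel weights; the 1/(h*sqrt(2*pi)) constant cancels in the ratio
    w = np.exp(-0.5 * ((X - x) / h) ** 2)
    return np.sum(w * Y) / np.sum(w)

# Symmetric design with a linear signal: at the center of the design,
# the weights are symmetric, so the estimate recovers the true value.
X = np.linspace(0, 10, 101)
Y = 2 * X
print(round(nadaraya_watson(5.0, X, Y, 1.0), 6))  # → 10.0
```

Changing h changes how local the average is, exactly as the bandwidth does in the R code.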

Actually, one can derive that estimate by using kernel-based estimators of densities. Recall that:

$\hat f(x) = \dfrac{1}{nh} \sum_{i=1}^n k\!\left(\dfrac{x - X_i}{h}\right)$

Now, use the fact that the expected value can be defined as:

$\mathbb{E}[Y \mid X = x] = \int y\, f(y \mid x)\, dy = \dfrac{\int y\, f(x,y)\, dy}{f(x)}$

Consider now a bivariate (product) kernel to estimate the joint density. The numerator is estimated by:

$\dfrac{1}{nh} \sum_{i=1}^n Y_i\, k\!\left(\dfrac{x - X_i}{h}\right)$

While the denominator is estimated by:

$\dfrac{1}{nh} \sum_{i=1}^n k\!\left(\dfrac{x - X_i}{h}\right)$
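Taking the ratio of the numerator and denominator estimates, the $1/(nh)$ factors cancel, and we recover exactly the kernel-weighted average introduced above:

```latex
\widehat{m}(x)
  = \frac{\frac{1}{nh}\sum_{i=1}^n Y_i\, k\!\left(\frac{x-X_i}{h}\right)}
         {\frac{1}{nh}\sum_{i=1}^n k\!\left(\frac{x-X_i}{h}\right)}
  = \frac{\sum_{i=1}^n Y_i\, k\!\left(\frac{x-X_i}{h}\right)}
         {\sum_{i=1}^n k\!\left(\frac{x-X_i}{h}\right)}
```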

In a general setting, we still use product kernels between Y and X and write

$\hat m(\mathbf{x}) = \dfrac{\sum_{i=1}^n Y_i\, k\!\left(H^{-1}(\mathbf{x} - \mathbf{X}_i)\right)}{\sum_{i=1}^n k\!\left(H^{-1}(\mathbf{x} - \mathbf{X}_i)\right)}$

for some symmetric positive definite bandwidth matrix $H$.

Now that we know what kernel estimates are, let us use them. For instance, assume that k is the density of the N(0,1) distribution. At point x, with a bandwidth h we get the following code:

mean_x = function(x, bw){
  w = dnorm((myocarde$INSYS - x)/bw, mean = 0, sd = 1)
  weighted.mean(myocarde$PRONO, w)
}
u = seq(5, 55, length = 201)
v = Vectorize(function(x) mean_x(x, 3))(u)

and of course, we can change the bandwidth.

v = Vectorize(function(x) mean_x(x,2))(u)

We observe what we can read in any textbook: with a smaller bandwidth, we get more variance, less bias. “More variance,” here, means more variability (since the neighborhood is smaller, there are fewer points to compute the average, and the estimate is more volatile), and “less bias” in the sense that the expected value is supposed to be computed at point x, so the smaller the neighborhood, the better.
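This bias–variance story can be checked numerically. Below is a small, self-contained Python experiment on synthetic data (not the myocarde data): the total variation of the fitted curve, a rough measure of its wiggliness, grows as the bandwidth shrinks.

```python
import numpy as np

def nw(x, X, Y, h):
    # Nadaraya-Watson estimate with a Gaussian kernel
    w = np.exp(-0.5 * ((X - x) / h) ** 2)
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
Y = np.sin(X) + rng.normal(0, 0.3, 100)
grid = np.linspace(1, 9, 81)

smooth = np.array([nw(x, X, Y, 2.0) for x in grid])  # large bandwidth
rough  = np.array([nw(x, X, Y, 0.3) for x in grid])  # small bandwidth

tv = lambda v: np.abs(np.diff(v)).sum()  # total variation of the fitted curve
print(tv(smooth) < tv(rough))  # → True: smaller bandwidth, more variability
```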

Using the ksmooth R Function

Actually, there is a function in R to compute this kernel regression.

reg = ksmooth(myocarde$INSYS,myocarde$PRONO,"normal",bandwidth = 2*exp(1))

We can replicate our previous estimate. Nevertheless, the output is not a function, but two series of vectors. That’s nice to get a graph, but that’s all we get. Furthermore, as we can see, the bandwidth is not exactly the same as the one we used before. I did not find any information online, so I tried to replicate the function we wrote before:

reg = ksmooth(myocarde$INSYS, myocarde$PRONO, "normal", bandwidth = bk)
f = function(bm){
  v = Vectorize(function(x) mean_x(x, bm))(reg$x)
  z = reg$y - v
  sum((z[!is.na(z)])^2)
}

There is a slope of 0.37, which is close to $e^{-1}$. Coincidence? I don’t know, to be honest…
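One possible explanation: the documentation of R’s ksmooth states that the kernels are scaled so that their quartiles (viewed as probability densities) are at ±0.25*bandwidth. For a Gaussian kernel, that makes its standard deviation 0.25/qnorm(0.75) ≈ 0.3706 times the stated bandwidth, which matches the 0.37 slope. A quick check in Python (the standard library’s NormalDist plays the role of R’s qnorm):

```python
from statistics import NormalDist

# ksmooth's "normal" kernel is scaled so its quartiles sit at +/- 0.25*bandwidth,
# i.e. 0.6745 * sigma = 0.25 * bandwidth, hence sigma ≈ 0.3706 * bandwidth.
sigma_per_bandwidth = 0.25 / NormalDist().inv_cdf(0.75)
print(round(sigma_per_bandwidth, 4))  # → 0.3706
```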

Application in a Higher Dimension

Consider now our bivariate dataset, and consider some product of univariate (Gaussian) kernels.

u = seq(0, 1, length = 101)
p = function(x, y){
  bw1 = .2; bw2 = .2
  w = dnorm((df$x1 - x)/bw1, mean = 0, sd = 1) *
      dnorm((df$x2 - y)/bw2, mean = 0, sd = 1)
  weighted.mean(df$y == "1", w)
}
v = outer(u, u, Vectorize(p))
image(u, u, v, xlab = "Variable 1", ylab = "Variable 2", col = clr10, breaks = (0:10)/10)
contour(u, u, v, levels = .5, add = TRUE)

We get the following prediction:

Here, the different colors are probabilities.

k-nearest Neighbors

An alternative is to consider a neighborhood defined not by a distance to the point x, but by the k nearest neighbors among the n observations we have.

$\hat m_k(x) = \dfrac{1}{k} \sum_{i \in \mathcal{N}_k(x)} Y_i$

where $\mathcal{N}_k(x)$ is the set of indices of the $k$ observations closest to $x$,

$\mathcal{N}_k(x) = \left\{ i : \sum_{j=1}^n \mathbf{1}\!\left(\|X_j - x\| \le \|X_i - x\|\right) \le k \right\}$

The difficult part here is that we need a valid distance. If units are very different on each component, using the Euclidean distance will be meaningless. So, quite naturally, let us consider here the Mahalanobis distance.

Sigma = var(myocarde[, 1:7])
Sigma_Inv = solve(Sigma)
d2_mahalanobis = function(x, y, Sinv){
  as.numeric(x - y) %*% Sinv %*% t(x - y)
}
k_closest = function(i, k){
  vect_dist = function(j) d2_mahalanobis(myocarde[i, 1:7], myocarde[j, 1:7], Sigma_Inv)
  vect = Vectorize(vect_dist)(1:nrow(myocarde))
  which(rank(vect) <= k)
}

Here we have a function to find the k closest neighbors of some observation. Then two things can be done to get a prediction. The goal is to predict a class, so we can use a majority rule: the prediction for Yi is the class of the majority of its neighbors.

k_majority = function(k){
  Y = rep(NA, nrow(myocarde))
  for(i in 1:length(Y)) Y[i] = sort(myocarde$PRONO[k_closest(i, k)])[(k+1)/2]
  return(Y)
}

But we can also compute the proportion of black points among the closest neighbors. It can actually be interpreted as the probability of being black (that’s actually what was said at the beginning of this post, with kernels).

k_mean = function(k){
  Y = rep(NA, nrow(myocarde))
  for(i in 1:length(Y)) Y[i] = mean(myocarde$PRONO[k_closest(i, k)])
  return(Y)
}

We can see on our dataset the observation, the prediction based on the majority rule, and the proportion of dead individuals among the 7 closest neighbors.

cbind(OBSERVED = myocarde$PRONO,
      MAJORITY = k_majority(7),
      PROPORTION = k_mean(7))
      OBSERVED MAJORITY PROPORTION
 [1,]        1        1  0.7142857
 [2,]        0        1  0.5714286
 [3,]        0        0  0.1428571
 [4,]        1        1  0.5714286
 [5,]        0        1  0.7142857
 [6,]        0        0  0.2857143
 [7,]        1        1  0.7142857
 [8,]        1        0  0.4285714
 [9,]        1        1  0.7142857
[10,]        1        1  0.8571429
[11,]        1        1  1.0000000
[12,]        1        1  1.0000000
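The whole pipeline (Mahalanobis distance, k closest neighbors, then majority vote and proportion) can be sketched in Python; the data below are synthetic, standing in for the myocarde dataset, and the names simply mirror the R code above:

```python
import numpy as np

def d2_mahalanobis(x, y, Sinv):
    # Squared Mahalanobis distance: (x - y)' Sinv (x - y)
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ Sinv @ d)

def k_closest(i, X, Sinv, k):
    # Indices of the k rows of X closest to row i; row i itself comes first,
    # since its distance to itself is zero
    dists = [d2_mahalanobis(X[i], X[j], Sinv) for j in range(len(X))]
    return np.argsort(dists)[:k]

def predict(i, X, Y, Sinv, k=7):
    neighbors = k_closest(i, X, Sinv, k)
    proportion = Y[neighbors].mean()   # proportion of 1s = estimated probability
    majority = int(proportion > 0.5)   # majority rule
    return majority, proportion

rng = np.random.default_rng(0)
# Two synthetic clusters with labels 0 and 1
X = np.vstack([rng.normal(0, 1, (15, 2)), rng.normal(5, 1, (15, 2))])
Y = np.repeat([0, 1], 15)
Sinv = np.linalg.inv(np.cov(X, rowvar=False))
print(predict(0, X, Y, Sinv), predict(29, X, Y, Sinv))
```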

Here, we got predictions for observed points, located at the Xi’s, but actually, it is possible to seek the k closest neighbors of any point x. Back to our univariate example (to get a graph), we have:

mean_x = function(x, k = 9){
  w = rank(abs(myocarde$INSYS - x), ties.method = "random")
  mean(myocarde$PRONO[which(w <= k)])
}
v = Vectorize(function(x) mean_x(x, 9))(u)

That’s not very smooth, but we do not have a lot of points either.

If we use that technique on our two-dimensional dataset, we obtain the following:

Sigma_Inv = solve(var(df[,c("x1","x2")]))
u = seq(0,1,length=51)
p = function(x, y){
  k = 6
  vect_dist = function(j) d2_mahalanobis(c(x, y), df[j, c("x1","x2")], Sigma_Inv)
  vect = Vectorize(vect_dist)(1:nrow(df))
  idx = which(rank(vect) <= k)
  return(mean((df$y == 1)[idx]))
}
v = outer(u,u,Vectorize(p))
image(u,u,v,xlab="Variable 1",ylab="Variable 2",col=clr10,breaks=(0:10)/10)
contour(u,u,v,levels = .5,add=TRUE)

This is the idea of local inference, using either kernel on a neighborhood of X or simply using the k nearest neighbors. Next time, we will investigate penalized logistic regressions.


Matthew Garrett Calls on Symantec to Share Its Code, EFF Questions Google’s Work on Project Maven and More

News briefs for April 6, 2018.

Linux kernel developer, free software activist and Google engineer Matthew Garrett discovered that Symantec is using a Linux distro based on the QCA Software Development Kit (QSDK) project: “This is a GPLv2-licensed, open-source platform built around the Linux-based OpenWrt Wi-Fi router operating system” (if true, this means Symantec needs to share the Norton Core Router’s code). So, Garrett tweeted “Hi @NortonOnline the Norton Core is clearly running Linux and the license requires you to distribute the kernel source code so where can I get it?” (Source: ZDNet.)

The EFF has questions and advice for Google regarding the company’s work on “Project Maven”, which is “a U.S. Department of Defense (DoD) initiative to deploy machine learning for military purposes”. Read the “Google Should Not Help the U.S. Military Build Unaccountable AI Systems” post by Peter Eckersley and Cindy Cohn for more information.

Ubuntu 18.04 LTS (Bionic Beaver) final beta was released this morning. This release includes Ubuntu 18.04 LTS Desktop, Server and Cloud products, as well as Kubuntu, Lubuntu, Ubuntu Budgie, Ubuntu Kylin, Ubuntu MATE, Ubuntu Studio, and Xubuntu. Note that this version is still beta and not intended for use in production. The final release is scheduled for April 26. See the release notes for more details and download images.

Zilliqa recently announced its Testnet v1.0 release, codename Red Prawn. According to the press release, Zilliqa is the “first blockchain platform to actually implement the technology of sharding, which has the potential to scale blockchain transaction speeds to match VISA.”

openSUSE’s Tumbleweed distro (a pure rolling-release version of openSUSE) had several snapshot releases this week, most notably with updates to KDE’s newest point version of Plasma (5.12.4). The snapshots this week also included updates to gstreamer, Firefox and Digikam, among other things.


Subutai Blockchain Router v2.0, NixOS New Release, Slimbook Curve and More

News briefs for April 5, 2018.

Subutai recently announced that its Subutai Blockchain Router v2.0 is in production: “This broadband cloud router serves as a ‘plug-and-play’ cryptocurrency wallet and mining device with energy savings of 10x over traditional mining methods, and also allows users to share and rent their idle computer resources by registering their computers with the Subutai Bazaar.”

NixOS released version 18.03 “Impala” yesterday. Highlights include “core version changes: linux: 4.9 -> 4.14, glibc: 2.25 -> 2.26, gcc: 6 -> 7, systemd: 234 -> 237”; “desktop version changes: gnome: 3.24 -> 3.26, (KDE) plasma-desktop: 5.10 -> 5.12”; the Nix package manager now defaults to 2.0 and more.

Matthew Garrett wrote a blog post yesterday titled “Linux Kernel Lockdown and UEFI Secure Boot” to elaborate on the kernel lockdown feature being paired with UEFI SecureBoot, in response to discussion on the LKML.

The Slimbook Curve—a new cool-looking, all-in-one Linux PC with a 24″ full-HD curved screen—is now available from Spanish company Slimbook. See the OMG Ubuntu post for specs and pricing info.

LibreOffice 6.0.3 is available for download. This is the third minor release of LibreOffice 6, and it has about 70 bug and regression fixes. This version “represents the bleeding edge in terms of features and as such is targeted at early adopters, tech-savvy and power users, while LibreOffice 5.4.6—provided as an alternative download option—is targeted at mainstream users and enterprise deployments.”


Mozilla Announces Firefox Reality Browser for Mixed Reality, GnuCash 3.0 New Release and More

Mozilla announced Firefox Reality today, “Bringing the Immersive Web to Mixed Reality Headsets”. Firefox Reality is the only open source browser for mixed reality and the first cross-platform browser for mixed reality. See The Mozilla Blog for more details.

GnuCash 3.0 was released yesterday, marking the first stable release in the 3.x series. This version has several new features, but the main update is the use of the Gtk+-3.0 Toolkit and the WebKit2Gtk API. See the announcement for a list of all the new features for both users and developers.

Kernel 4.17 will have 500,000 fewer lines of code, as maintainers have decided to deprecate support for old CPU architectures. As written in the pull request on the LKML, “This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers.”

Compete in the second annual Linux Game Jam! Submissions will be accepted starting April 5th and the deadline is April 14th. This year’s theme is “Versatile Verbs”. See the website for all the rules.

OpenSSH 7.7 was released this morning. This version is primarily a bugfix release.

And in other new releases, the OpenBSD team announced new version 6.3 yesterday. This update features SMP support on arm64 and multiple security improvements, including Meltdown/Spectre (variant 2) mitigations. See the release page for the complete list of changes.


Linux 4.16 Released, SLES SP3 for Raspberry Pi, Cloudflare Launches the Privacy-First DNS Service and More

News briefs for April 2, 2018.

Linux 4.16 was released yesterday. Linus says “the take from final week of the 4.16 release looks a lot like rc7, in that about half of it is networking. If it wasn’t for that, it would all be very small and calm.”

SUSE recently released SLES SP3 for the Raspberry Pi, which includes full commercial support for enterprise users. The new version “targets the Raspberry Pi Model 3 B, although SUSE says it is planning support for the new Raspberry Pi Model 3 B+”. In addition, SUSE “developers have made the new image smaller—around 630MB—by trimming compilers and debugging tools while tuning the Arm OS for IoT tasks”. For more details, see the ZDNet article.

Cloudflare announced yesterday the launch of its new consumer DNS service, “the Internet’s fastest, privacy-first consumer DNS service”. Cloudflare is focused on privacy, and it has “committed to never writing the querying IP addresses to disk and wiping all logs within 24 hours”.

Arcan is working on developing Safespaces, “an open source VR desktop”, designed to run on the Arcan display server. See the Arcan blog for more information and a demo video. You can check out the code on GitHub.

Everspace, a 3D single-player space-shooter game, is officially coming to Linux soon. Rockfish games announced it’s planning to release a patch with bugfixes and improved joystick support in two to four weeks, adding “We also hope to announce the official Linux release, then!”


diff -u: Speeding Up the Un-Speed-Up-able

Sometimes kernel developers can work in parallel for years without realizing it. It’s one of the inefficiencies of a distributed system that tends to work out as a benefit when you have enough contributors to be able to spare the extra labor—it’s sort of a “with enough eyeballs, all bugs are shallow” kind of thing.

This time Kirill A. Shutemov wanted to scratch a particular itch in the memory encryption code. Memory encryption occurs at such a heavily used place in the kernel that a difference of just a couple opcodes can make a noticeable speed difference to a running system. Anything short of introducing security holes would be worth considering, in order to shave off those few Assembly instructions.

But Kirill had a problem—his code made things worse. He wanted to convert the __PHYSICAL_MASK variable into what he called a “patchable constant”—a value that had the simplicity of a constant at runtime, but that could be set dynamically by configuration options or at the bootloader command line before booting the system.

So far, he had come up with a patch that did this—and that would be applicable to other variables that might be faster to implement as constants. Unfortunately, as implemented, although it achieved the feature he wanted, it caused GCC to produce code that was less efficient than what it had produced before. Instead of speeding things up, this resulted in a 0.2% slowdown on his test system.

He posted his patch, along with a request for ideas, to the linux-kernel mailing list.

Linus Torvalds replied with an analysis of the opcodes produced by GCC, showing why they were slower and why Kirill’s general approach always would suffer from the slowdown issue. In particular, Linus pointed out that Kirill moved constant values into registers, which never would be optimal, and his code also included a movabsq instruction, which he felt was rarely needed.

Linus said he actually really wanted this feature in spite of the complex overhead required to accomplish it; in fact, he preferred an even more complex approach, along the lines of something H. Peter Anvin had attempted about 18 months earlier. Of that approach, he said:

It’s rather more complex, but it actually gives a much bigger win. The code itself will be much better, and smaller. The *infrastructure* for the code gets pretty hairy, though.

Peter replied that he’d actually been working on this very project for some time from a secret, undisclosed location at the bottom of a well in a town with no name. In fact, his current approach was yet more ambitious (and complex) than what he’d tried 18 months earlier. On the other hand, as he put it, he was now about 85% done with the whole thing, and the code “mostly needs cleanup, testing, and breaking into reasonable chunks.” He added:

The main reason I haven’t submitted it yet is that I got a bit overly ambitious and wanted to implement a whole bunch of more complex subcases, such as 64-bit shifts on a 32-bit kernel. The win in that case is actually quite huge, but it requires data-dependent code patching and not just immediate patching, which requires augmentation of the alternatives framework.

So it looks like Kirill’s itch is about to be scratched…by Peter. An element of the kernel so deep and tangled that it seemed intended never to be touched is being tugged apart beyond its apparent natural limits. And, all of this is achieved by the mere addition of a big honking pile of insanely complex code that even seasoned developers balked at touching.

Note: If you’re mentioned above and want to post a response above the comment section, send a message with your response text to


diff -u: Intel Design Flaw Fallout

For weeks, the world’s been talking about severe Intel design flaws affecting many CPUs and forcing operating systems to look for sometimes costly workarounds.

Linux patches for these issues are in a state of ongoing development. Security is always the first priority, at the expense of any other feature. Next would probably be the general speed of a running system for the average user. After that, the developers might begin piecing together any features that had been pulled as part of the initial security fix.

But while this effort goes on, the kernel developers seem fairly angry at Intel, especially when they feel that Intel is not doing enough to fix the problems in future processors.

In response to one set of patches, for example, Linus Torvalds burst out with, “All of this is pure garbage. Is Intel really planning on making this shit architectural? Has anybody talked to them and told them they are f*cking insane?” He went on, “the IBRS garbage implies that Intel is _not_ planning on doing the right thing for the indirect branch speculation. Honestly, that’s completely unacceptable.” And then he said:

The whole IBRS_ALL feature to me very clearly says “Intel is not serious about this, we’ll have an ugly hack that will be so expensive that we don’t want to enable it by default, because that would look bad in benchmarks”. So instead they try to push the garbage down to us. And they are doing it entirely wrong.

He went on, even more disturbingly, to say:

The patches do things like add the garbage MSR writes to the kernel entry/exit points. That’s insane. That says “we’re trying to protect the kernel”. We already have retpoline there, with less overhead.

So somebody isn’t telling the truth here. Somebody is pushing complete garbage for unclear reasons. Sorry for having to point that out….As it is, the patches are COMPLETE AND UTTER GARBAGE….WHAT THE F*CK IS GOING ON?

At one point, David Woodhouse offered a helpful technical summary of the whole situation for those of us on the edge of our seats:

This is all about Spectre variant 2, where the CPU can be tricked into mispredicting the target of an indirect branch. And I’m specifically looking at what we can do on *current* hardware, where we’re limited to the hacks they can manage to add in the microcode.

The new microcode from Intel and AMD adds three new features.

One new feature (IBPB) is a complete barrier for branch prediction. After frobbing this, no branch targets learned earlier are going to be used. It’s kind of expensive (order of magnitude ~4000 cycles).

The second (STIBP) protects a hyperthread sibling from following branch predictions which were learned on another sibling. You *might* want this when running unrelated processes in userspace, for example. Or different VM guests running on HT siblings.

The third feature (IBRS) is more complicated. It’s designed to be set when you enter a more privileged execution mode (i.e. the kernel). It prevents branch targets learned in a less-privileged execution mode, BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it’s not just a “set-and-forget” feature, it also has barrier-like semantics and needs to be set on *each* entry into the kernel (from userspace or a VM guest). It’s *also* expensive. And a vile hack, but for a while it was the only option we had.

Even with IBRS, the CPU cannot tell the difference between different userspace processes, and between different VM guests. So in addition to IBRS to protect the kernel, we need the full IBPB barrier on context switch and vmexit. And maybe STIBP while they’re running.

Then along came Paul with the cunning plan of “oh, indirect branches can be exploited? Screw it, let’s not have any of *those* then”, which is retpoline. And it’s a *lot* faster than frobbing IBRS on every entry into the kernel. It’s a massive performance win.

So now we *mostly* don’t need IBRS. We build with retpoline, use IBPB on context switches/vmexit (which is in the first part of this patch series before IBRS is added), and we’re safe. We even refactored the patch series to put retpoline first.

But wait, why did I say “mostly”? Well, not everyone has a retpoline compiler yet…but OK, screw them; they need to update.

Then there’s Skylake, and that generation of CPU cores. For complicated reasons they actually end up being vulnerable not just on indirect branches, but also on a “ret” in some circumstances (such as 16+ CALLs in a deep chain).

The IBRS solution, ugly though it is, did address that. Retpoline doesn’t. There are patches being floated to detect and prevent deep stacks, and deal with some of the other special cases that bite on SKL, but those are icky too. And in fact IBRS performance isn’t anywhere near as bad on this generation of CPUs as it is on earlier CPUs *anyway*, which makes it not quite so insane to *contemplate* using it as Intel proposed.

That’s why my initial idea, as implemented in this RFC patchset, was to stick with IBRS on Skylake, and use retpoline everywhere else. I’ll give you ‘garbage patches’, but they weren’t being ‘just mindlessly sent around’. If we’re going to drop IBRS support and accept the caveats, then let’s do it as a conscious decision having seen what it would look like, not just drop it quietly because poor Davey is too scared that Linus might shout at him again.

I have seen *hand-wavy* analyses of the Skylake thing that mean I’m not actually lying awake at night fretting about it, but nothing concrete that really says it’s OK.

If you view retpoline as a performance optimisation, which is how it first arrived, then it’s rather unconventional to say “well, it only opens a *little* bit of a security hole but it does go nice and fast so let’s do it”.

But fine, I’m content with ditching the use of IBRS to protect the kernel, and I’m not even surprised. There’s a *reason* we put it last in the series, as both the most contentious and most dispensable part. I’d be *happier* with a coherent analysis showing Skylake is still OK, but hey-ho, screw Skylake.

The early part of the series adds the new feature bits and detects when it can turn KPTI off on non-Meltdown-vulnerable Intel CPUs, and also supports the IBPB barrier that we need to make retpoline complete. That much I think we definitely *do* want. There have been a bunch of us working on this behind the scenes; one of us will probably post that bit in the next day or so.

I think we also want to expose IBRS to VM guests, even if we don’t use it ourselves. Because Windows guests (and RHEL guests; yay!) do use it.

If we can be done with the shouty part, I’d actually quite like to have a sensible discussion about when, if ever, we do IBPB on context switch (ptraceability and dumpable have both been suggested) and when, if ever, we set STIPB in userspace.

Most of the discussion on the mailing list focused on the technical issues surrounding finding actual solutions. But Linus was not alone in finding the situation unacceptable. A variety of developers, including David, were horribly offended, not by the design flaw itself, but by the way they perceived Intel to be handling the subsequent situation—the poor technical fixes, the lack of communication between Intel developers and the kernel community, and as Linus pointed out, the potential choice by Intel not to fix some of the problems at all.



Tails Security Update, Companies Team Up to Cure Open Source License Noncompliance, LG Expanding webOS and More

News briefs for March 19, 2018.

Tails 3.6.1 is out, and the release fixes many security holes in 3.6. Update now.

According to a Red Hat press release this morning: “six additional companies have joined efforts to promote greater predictability in open source licensing. These marquee technology companies—CA Technologies, Cisco, HPE, Microsoft, SAP, and SUSE—have committed to extending additional rights to cure open source license noncompliance. This will lead to greater cooperation with distributors of open source software to correct errors and increased participation in open source software development.”

LG announced today that “As part of a broader effort to make webOS even more accessible to today’s consumers and industries”, it is partnering with the National IT Industry Promotion Agency, a governing body within South Korea’s Ministry of Science and ICT, “to more actively advance its philosophy of open platform, open partnership and open connectivity”. As part of its global expansion goal, “LG developed an open source version of its platform, webOS Open Source Edition”, now available to the public.

Stack Overflow’s 2018 Developer Survey results are in, and this year, 48.3% of developers said they use Linux as a platform, compared to 35.4% using Windows. Linux also was voted “Most Loved Platform” at 76.5%. Other interesting results include Python winning as “Most Wanted Language” at 25.1%.

Linux 4.16-rc6 was released yesterday; the changes were minimal, and things are on track. Linus says “Go test, things are stable and there’s no reason to worry, but all the usual reasons to just do a quick build and verification that everything works for everybody. Ok?”


It’s Here. The March 2018 Issue of Linux Journal Is Available for Download Now.

Boasting as many pages as most technical books, this month’s issue of Linux Journal comes in at a hefty 181—that’s 23 articles exploring topics near and dear to everyone from home automation hobbyists to Free Software advocates to hard-core hackers to high-level systems architects.

Besides making the magazine bigger overall with more articles in each issue on a wider range of topics, we’ve also added a new feature that explores a given topic in depth: the Deep Dive—think of it like an ebook inside each magazine. This month, contributing editor Petros Koutoupis dives deep into blockchain. He explores what makes Bitcoin and blockchain so exciting, what they provide, and what the future of blockchain holds. From there, he describes how to set up a private Ethereum blockchain using open-source tools and looks at some markets and industries where blockchain technologies can add value.

Subscribers, you can download your March issue now. Not a subscriber? It’s not too late. Subscribe today and receive instant access to this and all back issues since 2010. Alternatively, you can buy the single issue here.

Eric Raymond’s New UPS Project, Ubuntu’s Bionic Beaver 18.04 Beta Released, Kernel Prepatch 4.16-rc5 and More

News briefs for March 12, 2018.

Eric Raymond has started a new project after ranting about the state of the Uninterruptible Power Supply (UPS) market, as The Register reports. The Upside project is hosted on GitLab and “is currently defining requirements and developing a specification for a ‘high quality UPS that can be built from off-the-shelf parts in any reasonably well-equipped makerspace or home electronics shop’.” You can read Eric’s original UPS rant here.

Bionic Beaver 18.04 Beta 1 was released last week and is available for download, including images for Kubuntu, Ubuntu Budgie, Ubuntu Kylin, Ubuntu MATE and Xubuntu. The release announcement notes that “pre-releases of the Bionic Beaver are *not* encouraged for anyone needing a stable system or anyone who is not comfortable running into occasional, even frequent breakage. They are, however, recommended for Ubuntu flavour developers and those who want to help in testing, reporting, and fixing bugs as we work towards getting this release ready.”

 is giving away a Raspberry Pi arcade gaming kit. Enter to win here. Deadline is March 25, 2018 at 11:59pm EDT.

Kernel prepatch 4.16-rc5 was released yesterday. According to Linus, “This continues to be pretty normal – this rc is slightly larger than rc4 was, but that looks like one of the normal fluctuations due to timing of pull requests, not due to anything distressing.”

Debian stretch 9.4 was released this weekend. The update includes many security and bug fixes, so update now if you haven’t already.


diff -u: Linus Posting Habits

A look into how, when and why Linus posts to the kernel mailing list.

Linus Torvalds sometimes is criticized for bombastically cursing out kernel developers. He does do this, but it’s not his default behavior, and I think the real nature of when and how he posts to the mailing list is interesting. For example, he stayed out of the whole discussion of how to replace the BitKeeper revision control system for a long time, letting various projects guess frustratingly at his desires, before he finally took a break from Linux development to design and implement git.

In other cases, he’s allowed developers to lambaste each other savagely for days or longer over key elements of the kernel or the behaviors of new hardware peripherals, only to enter the debate later on, generally to propose a simpler solution that neither camp had thought of.

Sometimes he’ll enter a discussion for apparently no other reason than that a particular bug or piece of code interests him, and he works with whoever posted a given patch to get the kinks out or track down underlying problems.

In general, Linus tends to stay out of most discussions, coming in primarily only on controversial issues after he’s developed a position of his own.

But yes, sometimes he comes down pretty hard on developers, generally saying they already should know better than to do a particular thing, or when he feels the developer is hawking a particular corporate goal that goes against the spirit of open-source development—although I doubt he’d put it that way himself. He doesn’t seem to like making overtly evangelical statements and tends to stick to the “if it works, I like it” attitude that he took when adopting BitKeeper.

Occasionally, Linus gets a technical issue wrong—he’ll claim something about the kernel as being true that isn’t. An example of that happened recently, when Amir Goldstein posted a patch to alter a string hashing function. It was a small patch that preserved some dropped bits in the 64-bit case and left the 32-bit case virtually unchanged. Amir asked for Linus’ advice because he couldn’t find a maintainer for the file, and Linus was one of the people who had most recently changed the code.

Linus didn’t agree that Amir’s code left the 32-bit case unchanged. And he singled out a call to the __hash_32() function as being particularly time-consuming. He rejected Amir’s patch, saying, “the patch as-is doesn’t seem to buy anything, and only adds cost.”

But when Amir pointed out that his patch hadn’t actually added the call to __hash_32(), that the call had been there already, Linus took another look and replied, “Oh, duh. My bad. It was indeed there already. Let me go back and look at the history of this thing.”

He later replied to his own post, saying:

After having looked more at it, I take back all my complaints about the patch, you were right and I was mis-reading things or just being stupid.

I also don’t worry too much about the possible performance impact of this on 64-bit, since most architectures that actually care about performance end up not using this very much (the dcache code is the most performance-critical, but the word-at-a-time case uses its own hashing anyway).

So this ends up being mostly used for filesystems that do their own degraded hashing (usually because they want a case-insensitive comparison function).

A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS, and then this potentially makes things more expensive on 64-bit architectures with slow or lacking multipliers even for the normal case.

That said, realistically the only such architecture I can think of is PA-RISC. Nobody really cares about performance on that, it’s more of a ‘look ma, I’ve got warts^W an odd machine’ platform.

So I think your patch is fine, and all my initial worries were just misplaced from not looking at this properly.


To me the interesting details about this kind of post are that Linus never seems to drag out an admission of error. He won’t cling to a mistaken idea out of embarrassment; he also avoids making excuses for why he got the thing wrong or dismissing the opposing side as unimportant. He also seems to put in some real work in developing a proper understanding of the situation after seeing that he made a mistake and moves directly into dealing with whatever the new technical details might be.

I personally like all that. And for me, it adds context to the times when he gets angry at developers. It seems to happen mostly—or only—when he feels that developers should have seen and acknowledged their own error by a given point.

Original Link

Kernel patch releases, WineHQ, OpenIndiana project, FreeBSD Unix distribution, Xubuntu community contest

News briefs for February 19, 2018.

In the past couple of days, the Linux kernel saw quite a few patch releases: 4.15.4, 4.14.20, 4.9.82, 4.4.116 and even 3.18.95. More information can be found at the Linux Kernel Archives website.

WineHQ just announced the availability of Wine 3.2 with bug fixes, better gamepad support and more. Wine is an open-source compatibility layer designed to run Microsoft Windows applications on top of Unix-like systems.

The OpenIndiana project is still alive and well, with a recent announcement of migrating the project to GCC 6.4. Unfortunately, this version does not address the Spectre/Meltdown vulnerabilities, although the next planned version, 7.3, will cover these hot issues. While on the topic, the FreeBSD Unix distribution finally patched and fixed its operating environment for both Spectre and Meltdown in revision 329462.

Attention all artists: the Xubuntu community is asking for submissions in a wallpaper contest for the upcoming 18.04 LTS release. The submission deadline is March 15, 2018.

Original Link

diff -u: Automated Bug Reporting

Bug reports are good. Anyone with a reproducible crash should submit a bug report on the linux-kernel mailing list. The developers will appreciate it, and you’ll be helping make Linux better!

A variety of automated bug-hunters are roaming around reporting bugs. One of them is Syzbot, an open-source tool specifically designed to find bugs in Linux and report them. Dmitry Vyukov recently sent in a hand-crafted email asking for help from the community to make Syzbot even more effective.

The main problems were how to track bugs after Syzbot had reported them and how to tell when a patch went into the kernel to address a given bug.

It turned out that Andrey Ryabinin and Linus Torvalds got together to collaborate on an easy solution for Dmitry’s problem: Syzbot should include a unique identifier in its own email address. The idea is that anything after a “+” in an email address is completely ignored for delivery, so an address carrying a “+” suffix reaches exactly the same mailbox as the address without it. Andrey and Linus suggested that Syzbot use this technique to include a hash value associated with each bug report. Then, Linux developers would include that email address in the “Reported-By” portion of their patch submissions as part of the normal developer process.

Presto! The unique hash would follow the patch around through every iteration.
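The “plus addressing” trick is easy to see with a few lines of plain shell (the address below is hypothetical, not Syzbot’s real mailbox): the tag between the “+” and the “@” rides along in the address and can be peeled back off at either end.

```shell
# Hypothetical tagged address; mail systems that support plus
# addressing deliver it to the mailbox named before the "+".
tagged="syzbot+8a9c0f3e@example.com"

# Strip the tag with shell parameter expansion.
local_part="${tagged%%+*}"         # drop everything from the "+" on
domain="${tagged#*@}"              # keep everything after the "@"
mailbox="${local_part}@${domain}"  # where the mail actually lands

# Recover the tag itself (the per-bug hash, in Syzbot's scheme).
tag="${tagged#*+}"
tag="${tag%%@*}"

echo "$mailbox"   # syzbot@example.com
echo "$tag"       # 8a9c0f3e
```

A developer who pastes the tagged address into a Reported-By line thus preserves the hash through every iteration of the patch, which is all the tracking Syzbot needed.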

Other folks had additional feedback about Syzbot. Eric Biggers wanted to see a public-facing user interface, so developers could check the status of bugs, diagnose which versions of kernels were affected and so on. It turned out that Dmitry was hard at work on just such a thing, although he’d need more time before it was ready for public consumption.

And, Eric W. Biederman was utterly disgruntled about several Syzbot deficiencies. For one thing, he felt Syzbot didn’t do a good enough job explaining how to reproduce a given bug. It just reported the problem and went on its merry way. Also, Eric didn’t like the use of the Go language in Syzbot, which he said handled threading in a complex manner that made it difficult to interact in simple ways with the kernel.

But Dmitry assured Eric that the significant parts of Syzbot were written in C++ and that the portions using the Go language were not used for kernel interactions. Dmitry also pointed out that Syzbot did provide information on how to reproduce crashes whenever possible, but that it just wasn’t always possible, and in a lot of cases, the bugs were so simple, it wasn’t even necessary to reproduce them.

In fact, there really wasn’t much discussion. Dmitry’s original problem was solved very quickly, and it appears that Syzbot and its back-end software are under very active development.

Original Link

FOSS Project Spotlight: LinuxBoot

Linux as firmware.

The more things change, the more they stay the same. That may sound cliché, but it’s still as true for the firmware that boots your operating system as it was in 2001 when Linux Journal first published Eric Biederman’s “About LinuxBIOS“. LinuxBoot is the latest incarnation of an idea that has persisted for around two decades now: use Linux as your bootstrap.

On most systems, firmware exists to put the hardware in a state where an operating system can take over. In some cases, the firmware and OS are closely intertwined and may even be the same binary; however, Linux-based systems generally have a firmware component that initializes hardware before loading the Linux kernel itself. This may include initialization of DRAM, storage and networking interfaces, as well as performing security-related functions prior to starting Linux. To provide some perspective, this pre-Linux setup could be done in 100 or so instructions in 1999; now it’s more than a billion.

Oftentimes it’s suggested that Linux itself should be placed at the boot vector. That was the first attempt at LinuxBIOS on x86, and it looked something like this:

#define LINUX_ADDR 0xfff00000 /* offset in memory-mapped NOR flash */

void linuxbios(void)
{
        /* "linux" itself is a predefined macro in GCC, hence the name */
        void (*linux_entry)(void) = (void *)LINUX_ADDR;
        linux_entry(); /* place this jump at the reset vector (0xfffffff0) */
}

This didn’t get very far though. Linux was not at the point where it could fully bootstrap itself—for example, it lacked functionality to initialize DRAM. So LinuxBIOS, later renamed coreboot, implemented the minimum hardware initialization functionality needed to start a kernel.

Today Linux is much more mature and can initialize much more—although not everything—on its own. A large part of this has to do with the need to integrate system and power management features, such as sleep states and CPU/device hotplug support, into the kernel for optimal performance and power-saving. Virtualization also has led to improvements in Linux’s ability to boot itself.

Firmware Boot and Runtime Components

Modern firmware generally consists of two main parts: hardware initialization (early stages) and OS loading (late stages). These parts may be divided further depending on the implementation, but the overall flow is similar across boot firmware. The late stages have gained many capabilities over the years and often have an environment with drivers, utilities, a shell, a graphical menu (sometimes with 3D animations) and much more. Runtime components may remain resident and active after firmware exits. Firmware, which used to fit in an 8 KiB ROM, now contains an OS used to boot another OS and doesn’t always stop running after the OS boots.

Figure 1. General Overview of Firmware Components and Boot Flow

LinuxBoot replaces the late stages with a Linux kernel and initramfs, which are used to load and execute the next stage, whatever it may be and wherever it may come from. The Linux kernel included in LinuxBoot is called the “boot kernel” to distinguish it from the “target kernel” that is to be booted and may be something other than Linux.

Figure 2. LinuxBoot Components and Boot Flow

Bundling a Linux kernel with the firmware simplifies the firmware in two major ways. First, it eliminates the need for firmware to contain drivers for the ever-increasing variety of boot media, networking interfaces and peripherals. Second, we can use familiar tools from Linux userspace, such as wget and cryptsetup, to handle tasks like downloading a target OS kernel or accessing encrypted partitions, so that the late stages of firmware do not need to (re-)implement sophisticated tools and libraries.

For a system with UEFI firmware, all that is necessary is PEI (Pre-EFI Initialization) and a small number of DXE (Driver eXecution Environment) modules for things like SMM and ACPI table setup. With LinuxBoot, we don’t need most DXE modules, as the peripheral initialization is done by Linux drivers. We also can skip the BDS (Boot Device Selection) phase, which usually contains a setup menu, a shell and various libraries necessary to load the OS. Similarly, coreboot’s ramstage, which initializes various peripherals, can be greatly simplified or skipped.

In addition to the boot path, there are runtime components bundled with firmware that handle errors and other hardware events. These are referred to as RAS (Reliability, Availability and Serviceability) features. RAS features often are implemented as high-priority, highly privileged interrupt handlers that run outside the context of the operating system—for example, in system management mode (SMM) on x86. This brings performance and security concerns that LinuxBoot is addressing by moving some RAS features into Linux. For more information, see Ron Minnich’s ECC’17 talk “Let’s Move SMM out of firmware and into the kernel“.

The initramfs used by the boot kernel can be any Linux-compatible environment that the user can fit into the boot ROM. For ideas, see “Custom Embedded Linux Distributions” published by Linux Journal in February 2018.

Some LinuxBoot developers also are working on u-root, which is a universal rootfs written in Go. Go’s portability and fast compilation make it possible to bundle the u-root initramfs as source packages with only a few toolchain binaries to build the environment on the fly. This enables real-time debugging on systems (for example, via serial console through a BMC) without the need to recompile or reflash the firmware. This is especially useful when a bug is encountered in the field or is difficult to reproduce.

Advantages of Openness and Using LinuxBoot

While LinuxBoot can benefit those who are familiar with Linux and want more control over their boot flow, companies with large deployments of Linux-based servers or products stand to gain the most. They usually have teams of engineers or entire organizations with expertise in developing, deploying and supporting Linux kernel and userland—their business depends on it after all.

Replacing obscure and often closed firmware code with Linux enables organizations to leverage talent they already have to optimize their servers’ and products’ boot flow, maintenance and support functions across generations of hardware from potentially different vendors. LinuxBoot also enables them to be proactive instead of reactive when tracking and resolving boot-related issues.

LinuxBoot users gain transparency, auditability and reproducibility with boot code that has high levels of access to hardware resources and sets up platform security policies. This is more important than ever with well-funded and highly sophisticated hackers going to extraordinary lengths to penetrate infrastructure. Organizations must think beyond their firewalls and consider threats ranging from supply-chain attacks to weaknesses in hardware interfaces and protocol implementations that can result in privilege escalation or man-in-the-middle attacks.

Although not perfect, Linux offers robust, well-tested and well-maintained code that is mission-critical for many organizations. It is open and actively developed by a vast community ranging from individuals to multi-billion dollar companies. As such, it is extremely good at supporting new devices, the latest protocols and getting the latest security fixes.

LinuxBoot aims to do for firmware what Linux has done for the OS.

Who’s Backing LinuxBoot and How to Get Involved

Although the idea of using Linux as its own bootloader is old and used across a broad range of devices, little has been done in terms of collaboration or project structure. Additionally, hardware comes preloaded with a full firmware stack that is often closed and proprietary, and the user might not have the expertise needed to modify it.

LinuxBoot is changing this. The last missing parts are actively being worked out to provide a complete, production-ready boot solution for new platforms. Tools also are being developed to reorganize standard UEFI images so that existing platforms can be retrofitted. And although the current efforts are geared toward Linux as the target OS, LinuxBoot has potential to boot other OS targets and give those users the same advantages mentioned earlier. LinuxBoot currently uses kexec and, thus, can boot any ELF or multiboot image, and support for other types can be added in the future.

Contributors include engineers from Google, Horizon Computing Solutions, Two Sigma Investments, Facebook, 9elements GmbH and more. They are currently forming a cohesive project structure to promote LinuxBoot development and adoption. In January 2018, LinuxBoot became an official project within the Linux Foundation with a technical steering committee composed of members committed to its long-term success. Effort is also underway to include LinuxBoot as a component of Open Compute Project’s Open System Firmware project. The OCP hardware community launched this project to ensure cloud hardware has a secure, open and optimized boot flow that meets the evolving needs of cloud providers.

LinuxBoot leverages the capabilities and momentum of Linux to open up firmware, enabling developers from a variety of backgrounds to improve it whether they are experts in hardware bring-up, kernel development, tooling, systems integration, security or networking. To join us, visit

About the Authors

David Hendricks and Andrea Barberio are engineers at Facebook whose teams oversee validation, provisioning and sustaining operations for servers. Ron Minnich invented LinuxBIOS, now called coreboot, in 1999. He started the LinuxBoot project at Google in 2017 and is now working with Gan Shun Lim and Christopher Koch to deploy LinuxBoot in Google’s infrastructure.

Original Link

Google Chrome’s Ad Filtering, Intel Expands Bug Bounty Program, GNOME 3.28 Beta and More

News briefs for February 15, 2018.

Starting today, Google Chrome will begin removing ads from sites that don’t follow the Better Ads Standards. For more info on how Chrome’s ad filtering will work, see the Chromium blog.

Intel announced yesterday that it’s expanding its bug bounty program and increasing awards. See the Intel security site or its HackerOne page for more details.

The Cloud Native Computing Foundation (CNCF) has added Vitess as its 16th hosted project. Vitess is a “technology developed by YouTube to shard large MySQL databases across multiple servers”. Read more about the Vitess Architecture and CNCF project requirements on The New Stack.

As reported on Softpedia News, the GNOME 3.28 beta was released yesterday. The release “promises many new features, as well as a wide range of enhancements, especially under the hood as most of the components were ported to the Meson build system.” For more info, see the changelog. The final release is set for March 14, 2018.

Fedora kernel and QA teams have organized a test day for final integration of kernel 4.15. The test day will be Thursday, February 22, 2018, and anyone can participate. If you’re interested, see the wiki for more info.

Original Link

ZFS for Linux

Presenting the Solaris ZFS filesystem, as implemented in Linux FUSE, native kernel modules and the Antergos Linux installer.

ZFS remains one of the most technically advanced and feature-complete filesystems since it appeared in October 2005. Code for Sun’s original Zettabyte File System was released under the CDDL open-source license, and it has since become a standard component of FreeBSD and slowly migrated to various BSD brethren, while maintaining a strong hold over the descendants of OpenSolaris, including OpenIndiana and SmartOS.

Oracle is the owner and custodian of ZFS, and it’s in a peculiar position with respect to Linux filesystems. Btrfs, the main challenger to ZFS, began development at Oracle, where it is a core component of Oracle Linux, despite stability issues. Red Hat’s recent decision to deprecate Btrfs likely introduces compatibility and support challenges for Oracle’s Linux road map. Oracle obviously has deep familiarity with the Linux filesystem landscape, having recently released “dedup” patches for XFS. ZFS is the only filesystem option that is stable, protects your data, is proven to survive in most hostile environments and has a lengthy usage history with well-understood strengths and weaknesses.

ZFS has been (mostly) kept out of Linux due to CDDL incompatibility with Linux’s GPL license. It is the clear hope of the Linux community that Oracle will re-license ZFS in a form that can be included in Linux, and we should all gently cajole Oracle to do so. Obviously, a re-license of ZFS will have a clear impact on Btrfs and the rest of Linux, and we should work to understand Oracle’s position as the holder of these tools. However, Oracle continues to gift large software projects for independent leadership. Examples of Oracle’s largesse include OpenOffice and, more recently, Java Enterprise Edition, so it is not inconceivable that Oracle’s generosity may at some point also extend to ZFS.

To further this conversation, I want to investigate the various versions of ZFS for Linux. Starting within an RPM-centric environment, I first describe how to install the minimally invasive FUSE implementation, then proceed with a native install of ZFS modules from source. Finally, leaving RPM behind, I proceed to the Antergos distribution that implements native ZFS as a supported installation option.

ZFS Technical Background

ZFS is similar to other storage management approaches, but in some ways, it’s radically different. ZFS does not normally use the Linux Logical Volume Manager (LVM) or disk partitions, and it’s usually convenient to delete partitions and LVM structures prior to preparing media for a zpool.

The zpool is the analog of the LVM. A zpool spans one or more storage devices, and members of a zpool may be of several types. The basic storage elements are single devices, mirrors and raidz. All of these storage elements are called vdevs.

Mirrored vdevs in a zpool present storage that’s the size of the smallest physical drive. A mirrored vdev can be upgraded (that is, increased in size) by attaching larger drives to the mirrorset and “resilvering” (synchronizing the mirrors), then detaching the smaller drives from the set. Resilvering a mirror involves copying only used blocks to the target device; unused blocks are not touched, which can make resilvering much faster than hardware-maintained disk mirroring (which copies unused storage).

ZFS also can maintain RAID devices, and unlike most storage controllers, it can do so without battery-backed cache (as long as the physical drives honor “write barriers”). ZFS can create a raidz vdev with multiple levels of redundancy, allowing the failure of up to three physical drives while maintaining array availability. Resilvering a raidz also involves only used blocks and can be much faster than a storage controller that copies all disk blocks during a RAID rebuild. A raidz vdev should normally comprise 8–12 drives (larger raidz vdevs are not recommended). Note that the number of drives in a raidz cannot be expanded.

ZFS greatly prefers to manage raw disks. RAID controllers should be configured to present the raw devices, never a hardware RAID array. ZFS is able to enforce storage integrity far better than any RAID controller, as it has intimate knowledge of the structure of the filesystem. All controllers should be configured to present “Just a Bunch Of Disks” (JBOD) for best results in ZFS.

Data safety is an important design feature of ZFS. All blocks written in a zpool are aggressively checksummed to ensure the data’s consistency and correctness. You can select the checksum algorithm from sha256, fletcher2 or fletcher4. You also can disable the checksum on user data, which is specifically never recommended (this setting might be useful on a scratch/tmp filesystem where speed is critical while consistency and recovery are irrelevant; however, sync=disabled is the recommended setting for temporary filesystems in ZFS).

You can change the checksum algorithm at any time, and new blocks will use the updated algorithm. A checksum is stored separately from the data block, with the parent block, in the hope that localized block damage can be detected. If a block is found to disagree with the parent’s checksum, an alternate copy of the block is retrieved from either a mirror or raidz device, rewritten over the bad block, then the I/O is completed without incident. ZFS filesystems can use these techniques to “self-heal” and protect themselves from “bitrot” data changes on hard drive platters that are caused by controller errors, power loss/fluctuations in the read/write heads, and even the bombardment of cosmic rays.
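The verify-and-repair cycle can be sketched with ordinary userland tools (a rough analogy using files and sha256sum, not ZFS internals): keep a data block, a mirror copy and a recorded checksum, and when the recomputed checksum disagrees with the record, rewrite the block from the mirror.

```shell
workdir="$(mktemp -d)"

# A "block" of data, a mirror copy and a recorded checksum.
printf 'important data' > "$workdir/block"
cp "$workdir/block" "$workdir/mirror"
sha256sum "$workdir/block" | awk '{print $1}' > "$workdir/block.sum"

# Simulate bitrot: the block silently changes on disk.
printf 'important dat4' > "$workdir/block"

# Verify: recompute the checksum and compare against the record.
current="$(sha256sum "$workdir/block" | awk '{print $1}')"
recorded="$(cat "$workdir/block.sum")"

if [ "$current" != "$recorded" ]; then
    # "Self-heal": rewrite the bad block from the good mirror copy.
    cp "$workdir/mirror" "$workdir/block"
fi

cat "$workdir/block"  # important data
```

ZFS performs this check on every read, block by block, with the checksum stored in the parent block, which is what lets the repair happen transparently before the I/O completes.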

ZFS can implement “deduplication” by maintaining a searchable index of block checksums and their locations. If a new block to be written matches an existing block within the index, the existing block is used instead, and space is saved. In this way, multiple files may share content by maintaining single copies of common blocks, from which they will diverge if any of their content changes. The documentation states that a “dedup-capable checksum” must be set before dedup can be enabled, and sha256 is offered as an example—the checksum must be “collision-resistant” to identify a block uniquely to assure the safety of dedup. Be warned that memory requirements for ZFS expand drastically when deduplication is enabled, which quickly can overwhelm a system lacking sufficient resources.
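The checksum-index idea behind dedup can be illustrated with a toy content-addressed store in plain shell (an illustration of the principle only; the dedup_put helper is hypothetical, not part of any ZFS tooling): blocks are filed under their sha256, so identical content is stored once and merely referenced again.

```shell
store="$(mktemp -d)"

# dedup_put FILE: store FILE's content under its sha256 hash.
# Identical content hashes to the same path, so it is kept only once.
dedup_put() {
    hash="$(sha256sum "$1" | awk '{print $1}')"
    [ -e "$store/$hash" ] || cp "$1" "$store/$hash"
    echo "$hash"
}

tmp="$(mktemp -d)"
printf 'hello world\n'    > "$tmp/a"
printf 'hello world\n'    > "$tmp/b"   # duplicate content
printf 'something else\n' > "$tmp/c"

h1="$(dedup_put "$tmp/a")"
h2="$(dedup_put "$tmp/b")"
h3="$(dedup_put "$tmp/c")"

# Three files went in, but only two unique blocks are stored.
ls "$store" | wc -l
```

The sketch also shows why a collision-resistant checksum matters: if two different blocks ever hashed to the same value, one would silently replace the other.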

The zpool can hold datasets, snapshots, clones and volumes. A “dataset” is a standard ZFS filesystem that has a mountpoint and can be modified. A “snapshot” is a point-in-time copy of a filesystem; as the parent dataset is changed, the snapshot collects the original blocks to maintain a consistent past image. A “clone” can be built upon a snapshot and allows a different set of changes to be applied to the past image, effectively allowing a filesystem to branch: the clone and original dataset continue to share unchanged blocks but otherwise diverge. A “volume” is similar to a block device, and can be loopback-mounted with a filesystem of any type, or perhaps presented as an iSCSI target. Checksums are enforced on volumes.

Note that, unlike partitions or logical volumes, elements in a zpool can be intermingled. ZFS knows that the outside edge of a disk is faster than the interior, and it may decide to mix blocks from multiple objects in a zpool at these locations to increase performance. Due to this commingling of filesystems, forensic analysis of zpools is difficult and expensive:

But, no matter how much searching you do, there is [sic] no ZFS recovery tools out there. You are welcome to call companies like Ontrack for data recovery. I know one person that did, and they spent $3k just to find out if their data was recoverable. Then they spent another $15k to get just 200GB of data back.

There are no fsck or defrag tools for ZFS datasets. The boot process never will be delayed because a dataset was not cleanly unmounted. There is a “scrub” tool that will walk a dataset and verify the checksum of every used block on all vdevs, but the scrub takes place on mounted and active datasets. ZFS can recover very well from power losses or otherwise dirty dismounts.

Fragmentation in ZFS is a larger question, and it appears related more to remaining storage capacity than rapid file growth and reduction. Performance of a heavily used dataset will begin to degrade when it is 50% full, and it will dramatically drop over 80% usage when ZFS begins to use “best-fit” rather than “first-fit” to store new blocks. Regaining performance after dropping below 50% usage can involve dropping and resilvering physical disks in the containing vdev until all of the dataset’s blocks have migrated. Otherwise, the dataset should be completely unloaded and erased, then reloaded with content that does not exceed 50% usage (the zfs send and receive utilities are useful for this purpose). It is important to provide ample free disk space to datasets that will see heavy use.

It is strongly encouraged to use ECC memory with ZFS. Error-correcting memory is advised as critical for the correct processing of checksums that maintain zpool consistency. Memory can be altered by system errors and cosmic rays—ECC memory can correct single-bit errors, and panic/halt the system when multi-bit errors are detected. ECC memory is normally found in servers, but is somewhat rare in desktops and laptops. Some warn of the “scrub of death” and describe actual lost data from non-ECC RAM. However, one of the creators of ZFS says that all filesystems are vulnerable when non-ECC memory is in use, that ZFS is actually more graceful in failure than most, and further describes undocumented settings that force ZFS to recompute checksums in memory repeatedly, which minimizes dangers from non-ECC RAM. A lengthy configuration guide addresses ZFS safety in a non-ECC environment with these undocumented settings, but the guide does not appear to cover the FUSE implementation.


The Linux implementation of FUSE received a ZFS port in 2006. FUSE is an interface that allows a filesystem to be implemented by a process that runs in userspace. Fedora has maintained zfs-fuse as an RPM package for some time, but this package does not appear in any of the Red Hat-based distributions, including Oracle Linux. Red Hat appears to have intentionally omitted any relevant RPM for ZFS support.

The FUSE implementation is likely the only way to (currently) use ZFS on Linux in a manner that is fully compliant with both the CDDL and the GPL.

The FUSE port is relatively slow compared to a kernel ZFS implementation. FUSE is not generally installed in a manner that is compatible with NFS, so a zfs-fuse filesystem cannot be exported over the network without preparing a FUSE version with NFS support (NFSv4 might be available if an fsid= is supplied). The zfs-fuse implementation is likely reasonable for local, archival and potentially compressed datasets. Some have used Btrfs for ad-hoc compressed filesystems, and zfs-fuse is certainly an option for similar activity.

The last version of zfs-fuse that will work in Oracle Linux 7.4 is the RPM in Fedora 25. A new ZFS release is in Fedora 26, but it fails to install on Oracle Linux 7.4 due to an OpenSSL dependency—Red Hat’s OpenSSL is now too old. The following shows installing the ZFS RPM:

# rpm -Uvh zfs-fuse-0.7.0-23.fc24.x86_64.rpm
Preparing...                       ################################# [100%]
Updating / installing...
   1:zfs-fuse-0.7.0-23.fc24        ################################# [100%]

# cat /etc/redhat-release /etc/oracle-release
Red Hat Enterprise Linux Server release 7.4 (Maipo)
Oracle Linux Server release 7.4

The zfs-fuse userspace agent must be executed before any zpools can be manipulated (note a systemd unit is included for this purpose):

# zfs-fuse

For an easy example, let’s re-task a small hard drive containing a Windows 7 installation:

# fdisk -l /dev/sdb

Disk /dev/sdb: 160.0 GB, 160000000000 bytes, 312500000 sectors
Disk label type: dos
Disk identifier: 0x8d206763

   Device Boot    Start       End    Blocks  Id System
/dev/sdb1   *      2048    206847    102400   7 HPFS/NTFS/exFAT
/dev/sdb2        206848 312496127 156144640   7 HPFS/NTFS/exFAT

It is usually most convenient to dedicate an entire disk to a zpool, so delete all the existing partitions:

# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): d
Partition number (1,2, default 2): 2
Partition 2 is deleted

Command (m for help): d
Selected partition 1
Partition 1 is deleted

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Now a zpool can be added on the drive (note that creating a pool adds a dataset of the same name, which, as you see here, is automatically mounted):

# zpool create vault /dev/sdb

# df | awk 'NR==1||/vault/'
Filesystem     1K-blocks  Used Available Use% Mounted on
vault          153796557    21 153796536   1% /vault

# mount | grep vault
vault on /vault type fuse.zfs

Creating a zpool on non-redundant devices is informally known as “hating your data” and should be contemplated only for demonstration purposes. However, zpools on non-redundant media (for example, flash drives) have obvious data-consistency and compression advantages over VFAT, and the copies parameter can be adjusted for such a dataset to force all blocks to be recorded on the media multiple times (up to three) to increase recoverability.

Mirrored drives can be created with zpool create vault mirror /dev/sdb /dev/sdc. Additional drives can be added as mirrors to an existing drive with zpool attach. A simple RAIDset can be created with zpool create vault raidz /dev/sdb /dev/sdc /dev/sdd.

The standard umount command should (normally) not be used to unmount ZFS datasets—use the zpool/zfs tools instead (note the “unmount” rather than “umount” spelling):

# zfs unmount vault

# df | awk 'NR==1||/vault/'
Filesystem     1K-blocks  Used Available Use% Mounted on

# zfs mount vault

# df | awk 'NR==1||/vault/'
Filesystem     1K-blocks  Used Available Use% Mounted on
vault          153796557    21 153796536   1% /vault

A ZFS dataset can be mounted in a new location by altering the “mountpoint”:

# zfs unmount vault
# mkdir /root/vault
# zfs set mountpoint=/root/vault vault
# zfs mount vault
# df | awk 'NR==1||/vault/'
Filesystem     1K-blocks  Used Available Use% Mounted on
vault          153796547    21 153796526   1% /root/vault
# zfs unmount vault
# zfs set mountpoint=/vault vault
# zfs mount vault
# df | awk 'NR==1||/vault/'
Filesystem     1K-blocks  Used Available Use% Mounted on
vault          153796547    21 153796526   1% /vault

The mountpoint is retained and is persistent across reboots.
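Because the mountpoint is an ordinary dataset property stored in the pool itself, it can be confirmed at any time (a sketch, assuming the "vault" pool from above):

```shell
# Skip quietly on systems without the ZFS userland tools installed.
command -v zfs >/dev/null 2>&1 || { echo "zfs not available"; exit 0; }
zfs get -H -o value mountpoint vault   # prints the persistent mountpoint
```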

Creating an additional dataset (and mounting it) is as easy as creating a directory (note this command can take some time):

# zfs create vault/tmpdir
# df | awk 'NR==1||/(vault|tmpdir)/'
Filesystem     1K-blocks  Used Available Use% Mounted on
vault          153796496   800 153795696   1% /vault
vault/tmpdir   153795717    21 153795696   1% /vault/tmpdir
# cp /etc/yum.conf /vault/tmpdir/
# ls -l /vault/tmpdir/
-rw-r--r--. 1 root root 813 Sep 23 16:47 yum.conf

ZFS supports several types of compression in a dataset. Gzip of varying degrees, zle and lzjb can all be present in a single mountpoint. The checksum algorithm also can be adjusted on the fly:

# zfs get compress vault/tmpdir
vault/tmpdir  compression  off  local
# zfs get checksum vault/tmpdir
vault/tmpdir  checksum  on  default
# zfs set compression=gzip vault/tmpdir
# zfs set checksum=fletcher2 vault/tmpdir
# cp /etc/redhat-release /vault/tmpdir
# zfs set compression=zle vault/tmpdir
# zfs set checksum=fletcher4 vault/tmpdir
# cp /etc/oracle-release /vault/tmpdir
# zfs set compression=lzjb vault/tmpdir
# zfs set checksum=sha256 vault/tmpdir
# cp /etc/os-release /vault/tmpdir

Note that the GZIP compression factor can be adjusted (the default is six, just as in the GNU GZIP utility). This will directly impact the speed and responsiveness of a dataset:

# zfs set compression=gzip-1 vault/tmpdir
# cp /etc/profile /vault/tmpdir
# zfs set compression=gzip-9 vault/tmpdir
# cp /etc/grub2.cfg /vault/tmpdir
# ls -l /vault/tmpdir
-rw-r--r--. 1 root root 6308 Sep 23 17:06 grub2.cfg
-rw-r--r--. 1 root root 32 Sep 23 17:00 oracle-release
-rw-r--r--. 1 root root 398 Sep 23 17:00 os-release
-rw-r--r--. 1 root root 1795 Sep 23 17:05 profile
-rw-r--r--. 1 root root 52 Sep 23 16:59 redhat-release
-rw-r--r--. 1 root root 813 Sep 23 16:58 yum.conf

Should the dataset no longer be needed, it can be dropped:

# zfs destroy vault/tmpdir
# df | awk 'NR==1||/(vault|tmpdir)/'
Filesystem 1K-blocks Used Available Use% Mounted on
vault 153796523 800 153795723 1% /vault

You can demonstrate a recovery in ZFS by copying a few files and creating a snapshot:

# cp /etc/passwd /etc/group /etc/shadow /vault
# ls -l /vault
-rw-r--r--. 1 root root 965 Sep 23 14:41 group
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
----------. 1 root root 1255 Sep 23 14:41 shadow
# zfs snapshot vault@goodver
# zfs list -t snapshot
vault@goodver       0      -    27K  -

Then you can simulate more file manipulations that involve the loss of a critical file:

# rm /vault/shadow
rm: remove regular file '/vault/shadow'? y
# cp /etc/resolv.conf /etc/nsswitch.conf /etc/services /vault/
# ls -l /vault
-rw-r--r--. 1 root root 965 Sep 23 14:41 group
-rw-r--r--. 1 root root 1760 Sep 23 16:14 nsswitch.conf
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
-rw-r--r--. 1 root root 98 Sep 23 16:14 resolv.conf
-rw-r--r--. 1 root root 670311 Sep 23 16:14 services

Normally, snapshots are visible in the .zfs directory of the dataset. However, this functionality does not exist within the zfs-fuse implementation, so you are forced to create a clone to retrieve your lost file:

# zfs clone vault@goodver vault/history
# ls -l /vault/history
-rw-r--r--. 1 root root 965 Sep 23 14:41 group
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
----------. 1 root root 1255 Sep 23 14:41 shadow

Note that the clone is not read-only, and you can modify it. The two mountpoints will maintain a common set of blocks, but are otherwise independent:

# cp /etc/fstab /vault/history
# ls -l /vault/history
-rw-r--r--. 1 root root 541 Sep 23 16:23 fstab
-rw-r--r--. 1 root root 965 Sep 23 14:41 group
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
----------. 1 root root 1255 Sep 23 14:41 shadow

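Where the clone above recovers one file without disturbing the live dataset, zfs rollback is the blunter alternative: it reverts the entire dataset to the snapshot, discarding everything written afterward (here, that would remove the three files copied in after the snapshot). A sketch:

```shell
# Skip quietly on systems without the ZFS userland tools installed.
command -v zfs >/dev/null 2>&1 || { echo "zfs not available"; exit 0; }

# WARNING: rollback destroys all changes made after the snapshot.
zfs rollback vault@goodver
ls -l /vault   # shadow is restored; later additions are gone
```

Prefer the clone when you need to merge old and new contents by hand.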
Assuming that you have completed your recovery activity, you can destroy the clone and snapshot. A scrub of the parent dataset to verify its integrity at that point might be wise, and then you can list your zpool history to see evidence of your session:

# zfs destroy vault/history
# zfs destroy vault@goodver
# zpool scrub vault
# zpool status vault
  pool: vault
 state: ONLINE
 scrub: scrub in progress for 0h1m, 30.93% done, 0h3m to go
config:
        NAME        STATE     READ WRITE CKSUM
        vault       ONLINE       0     0     0
          sdb       ONLINE       0     0     0
errors: No known data errors
# zpool history vault

For my final words on zfs-fuse, I’m going to list the software version history for zpool and zfs. Note: it is critical that you create your zpools with the lowest ZFS version that you wish to use, which in this case is zpool version 23 and zfs version 4:

# zpool upgrade -v
This system is currently running ZFS pool version 23.
The following versions are supported:
VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
 14  passthrough-x aclinherit
 15  user/group space accounting
 16  stmf property support
 17  Triple-parity RAID-Z
 18  Snapshot user holds
 19  Log device removal
 20  Compression using zle (zero-length encoding)
 21  Deduplication
 22  Received properties
 23  Slim ZIL
# zfs upgrade -v
The following filesystem versions are supported:
VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS filesystem version
 2   Enhanced directory entries
 3   Case insensitive and File system unique identifier (FUID)
 4   userquota, groupquota properties

Native ZFS

You can obtain a zfs.ko kernel module from the ZFS on Linux site and load it into Linux, which will provide high-performance ZFS with full functionality. To install this package, you must remove the FUSE version of ZFS (assuming it was installed as in the previous section):

# rpm -e zfs-fuse
Removing files since we removed the last package

After the FUSE removal, you need to install a new yum repository on the target system. ZFS on a Red Hat-derivative likely will require network access to the ZFS repository (standalone installations will be more difficult and are not covered here):

# yum install \
====================================================================
 Package            Repository                 Size
====================================================================
Installing:
 zfs-release        /zfs-release.el7_4.noarch  2.9 k
====================================================================
Install  1 Package

Total size: 2.9 k
Installed size: 2.9 k
Is this ok [y/d/N]: y
...
Installed:
  zfs-release.noarch 0:1-5.el7_4
Complete!

After configuring the repository, load the GPG key:

# gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
pub   2048R/F14AB620 2013-03-21 ZFS on Linux
      Key fingerprint = C93A FFFD 9F3F 7B03 C310  CEB6 A9D5 A1C0 F14A B620
sub   2048R/99685629 2013-03-21

At this point, you’re ready to proceed with a native ZFS installation.

The test system used here, Oracle Linux 7.4, normally can boot from one of two kernels. There is a “Red Hat-Compatible Kernel” and also an “Unbreakable Enterprise Kernel” (UEK). Although the FUSE version is completely functional under both kernels, the native ZFS installer does not work with the UEK (meaning further that Oracle Ksplice is precluded with the standard ZFS installation). If you are running Oracle Linux, you must be booted on the RHCK when manipulating a native ZFS configuration, and this includes the initial install. Do not attempt installation or any other native ZFS activity while running the UEK:

# rpm -qa | grep ^kernel | sort

The ZFS installation actually uses yum to compile C source code in the default configuration (DKMS), then prepares an initrd with dracut (use top to monitor this during the install). This installation will take some time, and there are notes on using a pre-compiled zfs.ko collection in an alternate installation configuration (kABI). The test platform used here is Oracle Linux, and the Red Hat-Compatible Kernel may not be fully interoperable with the precompiled zfs.ko collection (not tested while preparing this article), so the default DKMS build was retained. Here’s an example installation session:

# yum install kernel-devel zfs
====================================================================
 Package            Repository   Size
====================================================================
Installing:
 zfs                zfs         405 k
Installing for dependencies:
 dkms               epel         78 k
 libnvpair1         zfs          29 k
 libuutil1          zfs          35 k
 libzfs2            zfs         129 k
 libzpool2          zfs         587 k
 spl                zfs          29 k
 spl-dkms           zfs         454 k
 zfs-dkms           zfs         4.9 M
====================================================================
Install  1 Package (+8 Dependent packages)

Total download size: 6.6 M
Installed size: 29 M
Is this ok [y/d/N]: y
...
  - Installing to /lib/modules/3.10.0-693.2.2.el7.x86_64/extra/icp.ko
Installed:
  zfs.x86_64 0:0.7.1-1.el7_4
Complete!

After the yum session concludes, you can load the native zfs.ko into the “RHCK” Linux kernel, which will pull in a number of dependent modules:

# modprobe zfs
# lsmod | awk 'NR==1||/zfs/'
Module Size Used by
zfs 3517672 0
zunicode 331170 1 zfs
zavl 15236 1 zfs
icp 266091 1 zfs
zcommon 73440 1 zfs
znvpair 93227 2 zfs,zcommon
spl 102592 4 icp,zfs,zcommon,znvpair

At this point, the pool created by FUSE can be imported back into the system (note the error):

# /sbin/zpool import vault
cannot import 'vault': pool was previously in use from another system.
Last accessed at Sun Sep 24 2017
The pool can be imported, use 'zpool import -f' to import the pool.
# /sbin/zpool import vault -f

The import will mount the dataset automatically:

# ls -l /vault
-rw-r--r--. 1 root root 965 Sep 23 14:41 group
-rw-r--r--. 1 root root 1760 Sep 23 16:14 nsswitch.conf
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
-rw-r--r--. 1 root root 98 Sep 23 16:14 resolv.conf
-rw-r--r--. 1 root root 670311 Sep 23 16:14 services

You can create a snapshot, then delete another critical file:

# /sbin/zfs snapshot vault@goodver
# rm /vault/group
rm: remove regular file '/vault/group'? y

At this point, you can search the /vault/.zfs directory for the missing file (note that .zfs does not appear with ls -a, but it is present nonetheless):

# ls -la /vault
drwxr-xr-x. 2 root root 6 Sep 25 17:47 .
dr-xr-xr-x. 19 root root 4096 Sep 25 17:17 ..
-rw-r--r--. 1 root root 1760 Sep 23 16:14 nsswitch.conf
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
-rw-r--r--. 1 root root 98 Sep 23 16:14 resolv.conf
-rw-r--r--. 1 root root 670311 Sep 23 16:14 services
# ls -l /vault/.zfs
dr-xr-xr-x. 2 root root 2 Sep 23 13:54 shares
drwxrwxrwx. 2 root root 2 Sep 25 17:47 snapshot
# ls -l /vault/.zfs/snapshot/
drwxr-xr-x. 2 root root 7 Sep 24 18:58 goodver
# ls -l /vault/.zfs/snapshot/goodver
-rw-r--r--. 1 root root 965 Sep 23 14:41 group
-rw-r--r--. 1 root root 1760 Sep 23 16:14 nsswitch.conf
-rw-r--r--. 1 root root 2269 Sep 23 14:41 passwd
-rw-r--r--. 1 root root 98 Sep 23 16:14 resolv.conf
-rw-r--r--. 1 root root 670311 Sep 23 16:14 services

Native ZFS implements newer software versions of zpool and zfs—remember, it is critical that you create your zpools with the lowest ZFS version that you ever intend to use, which in this case is zpool version 28, and zfs version 5. The FUSE version is far simpler to install on a fresh Red Hat OS for recovery purposes, so consider carefully before upgrading to the native ZFS versions:

# /sbin/zpool upgrade -v
...
 23  Slim ZIL
 24  System attributes
 25  Improved scrub stats
 26  Improved snapshot deletion performance
 27  Improved snapshot creation performance
 28  Multiple vdev replacements
# /sbin/zfs upgrade -v
...
 4   userquota, groupquota properties
 5   System attributes

Strong words of warning should accompany the use of native ZFS on a Red Hat-derivative.

Kernel upgrades are a cause for concern. If the zfs.ko family of modules is not installed correctly, then no pools can be brought online. For this reason, it is all the more imperative to retain known working kernels when upgraded kernels are installed. As I’ve noted previously, Oracle’s UEK is not ZFS-capable when using the default native installation.

OS release upgrades also introduce even more rigorous warnings. Before attempting an upgrade, remove all of the ZFS software. Upon upgrade completion, repeat the ZFS software installation using a yum repository that is specific for the new OS release. The ZFS on Linux site currently lists repositories for Red Hat releases 6, 7.3 and 7.4. It is wise to stay current on patches and releases, and strongly consider upgrading a 7.0 – 7.2 Red Hat-derivative where native ZFS installation is contemplated or desired.

Note also that Solaris ZFS has encryption and Windows SMB capability—these are not functional in the Linux port.

Perhaps someday Oracle will permit the Red Hat family to bundle native ZFS by relaxing the license terms. That will be a very good day.


Definite legal ambiguity remains with ZFS. Although Ubuntu recently announced support for the zfs.ko module for its container subsystem, its legal analysis remains murky. Unsurprisingly, none of the major enterprise Linux distributions have been willing to bundle ZFS as a first-class supported filesystem.

Into this void comes Antergos, a descendant of Arch Linux. The Antergos installer will download and compile ZFS source code into the installation kernel in a manner similar to the previous section. Although the example installation detailed here did not proceed without incident, it did leave a working, mirrored zpool for the root filesystem running the same version release as the native RPM installs.

What Antergos did not do was install the Linux kernel itself to both drives. A separate ext4 partition was configured for /boot on only one drive, because Grub2 does not support ZFS, and there appears to be a current lack of alternatives for booting Linux from a ZFS dataset. I had expected to see an installation similar to MirrorDisk/UX for HP-UX, where the firmware is configured with primary and alternate boot paths, and the OS is intelligent enough to manage identical copies of the boot and root filesystems on multiple drives. What I actually found was the root filesystem mirrored by ZFS, but the kernel in /boot is not, nor is the system bootable if the single ext4 /boot partition fails. A fault-tolerant Antergos installation will require RAID hardware—ZFS is not sufficient.

You can download the Antergos Live ISO and write it as a bootable image to a flash drive with the command:

# dd bs=4M if=antergos-17.9-x86_64.iso of=/dev/sdc

Note that the Antergos Minimal ISO does not support ZFS; it’s only in the Live ISO. Internet access is required while the installer is running. The latest packages will be downloaded in the installer session, and very little is pulled from the ISO media.

After booting your system on the live ISO, ensure that you are connected to the internet and activate the installer dialog. Note the warnings of beta software status—whether this refers to ZFS, Btrfs or other Linux RAID configurations is an open question.

Figure 1. Installer Warning

Select your territory or locale, time zone, keyboard layout (I suggest the “euro on 5”), and choose your desktop environment. After I chose GNOME, I also added Firefox and the SSH Service. Finally, a ZFS option is presented—enable it (Figure 2).

Figure 2. Toggle ZFS

As Figure 3 shows, I configured two SATA drives in a zpool mirror. I named the pool “root”, which may have caused an error at first boot. Note also the 4k block size toggle—this is a performance-related setting that might be advisable for some configurations and usage patterns.

Figure 3. Configure the zpool

The next pages prompt for the final confirmation before the selected drives are wiped, after which you will be prompted to create a default user.

While the installer is running, you can examine the zpool. After opening a terminal and running sudo sh, I found the following information about the ZFS configuration:

sh-4.4# zpool history
History for 'root': 2017-09-30 16:10:28
zpool create -f -m /install root mirror /dev/sda2 /dev/sdb
zpool set bootfs=root root
zpool set cachefile=/etc/zfs/zpool.cache root
zfs create -V 2G root/swap
zfs set com.sun:auto-snapshot=false root/swap
zfs set sync=always root/swap
zpool export -f root
zpool import -f -d /dev/disk/by-id -R /install 13754361671922204858

Note that /dev/sda2 has been mirrored to /dev/sdb, showing that Antergos has installed a zpool on an MBR partition. More important, these drives are not configured identically. This is not a true redundant mirror with the ability to boot from either drive.

After fetching and installing the installation packages, Antergos will build zfs.ko. You can see the calls to gcc if you run the top command in a terminal window.

Figure 4. Building ZFS

My installation session completed normally, and the system rebooted. GRUB presented me with the Antergos boot splash, but after booting, I was thrown into single-user mode:

starting version 234
ERROR: resume: no device specified for hibernation
ZFS: Unable to import pool root.
cannot import 'root': pool was previously in use from another system.
Last accessed by <unknown> (hostid=0) at Tue Oct 3 00:06:34 2017
The pool can be imported, use 'zpool import -f' to import the pool.
ERROR: Failed to mount the real root device.
Bailing out, you are on your own. Good luck.
sh: can't access tty; job control turned off
[rootfs ]# zpool import -f root
cannot mount '/': directory is not empty
[rootfs ]# zfs create root/hold
[rootfs ]# cat /dev/vcs > /hold/vcs.txt

The zpool import error above also was encountered when the FUSE pool was imported by the native driver. I ran the force import (zpool import -f root), which succeeded, then created a new dataset and copied the terminal contents to it, so you can see the session here. After a Ctrl-Alt-Delete, the system booted normally. Naming the zpool “root” in the installer may have caused this problem.

My test system does not have ECC memory, so I attempted to adjust the undocumented kernel parameter below, followed by a reboot:

echo options zfs zfs_flags=0x10 >> /etc/modprobe.d/zfs.conf

After the test system came up, I checked the flags and found that the ECC memory feature had not been set. I set it manually, then ran a scrub:

# cat /sys/module/zfs/parameters/zfs_flags
0
# echo 0x10 > /sys/module/zfs/parameters/zfs_flags
# cat /sys/module/zfs/parameters/zfs_flags
16
# zpool scrub root
# zpool status root
  pool: root
 state: ONLINE
  scan: scrub in progress since Sun Oct  1 12:08:50 2017
        251M scanned out of 5.19G at 25.1M/s, 0h3m to go
        0B repaired, 4.72% done
config:
        NAME                              STATE     READ WRITE CKSUM
        root                              ONLINE       0     0     0
          mirror-0                        ONLINE       0     0     0
            wwn-0x5000cca20cda462e-part2  ONLINE       0     0     0
            wwn-0x5000c5001a0d9823        ONLINE       0     0     0
errors: No known data errors

I also found that the kernel and initrd do not incorporate version numbers in their filenames, indicating that an upgrade may overwrite them. It likely will be wise to copy them to alternate locations within /boot to ensure that a fallback kernel is available (this would need extra menu entries in GRUB):

# ls -l /boot
-rw-r--r-- 1 root root 26729353 Sep 30 17:25 initramfs-linux-fallback.img
-rw-r--r-- 1 root root 9225042 Sep 30 17:24 initramfs-linux.img
-rw-r--r-- 1 root root 5474064 Sep 21 13:34 vmlinuz-linux
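A minimal sketch of such a fallback copy, using the filenames from the listing above (the matching GRUB menu entry still must be written by hand):

```shell
# Preserve a known-good kernel and initrd before a package upgrade
# can overwrite the unversioned files; requires root.
[ -f /boot/vmlinuz-linux ] || { echo "no Arch-style kernel here"; exit 0; }
cp /boot/vmlinuz-linux       /boot/vmlinuz-linux.known-good
cp /boot/initramfs-linux.img /boot/initramfs-linux.known-good.img
ls -l /boot
```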

You can continue your investigation into the Antergos zpool mirror by probing the drives with fdisk:

sh-4.4# fdisk -l /dev/sda
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Disklabel type: dos
Device     Boot   Start       End   Sectors   Size Id Type
/dev/sda1  *       2048   1048575   1046528   511M 83 Linux
/dev/sda2       1048576 488397167 487348592 232.4G 83 Linux
sh-4.4# fdisk -l /dev/sdb
Disk /dev/sdb: 149 GiB, 160000000000 bytes, 312500000 sectors
Disklabel type: gpt
Device         Start       End   Sectors  Size Type
/dev/sdb1       2048 312481791 312479744  149G Solaris /usr & Apple ZFS
/dev/sdb9  312481792 312498175     16384    8M Solaris reserved 1

Antergos appears to be playing fast and loose with the partition types. You also can see that the /boot partition is a non-redundant ext4:

# grep -v ^# /etc/fstab
UUID=f9fc... /boot ext4 defaults,relatime,data=ordered 0 0
/dev/zvol/root/swap     swap   swap  defaults                        0 0
# df | awk 'NR==1||/boot/'
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 498514 70732 418454 15% /boot

Antergos is not configuring a completely fault-tolerant drive mirror, and this is a known problem. The ext4 partition holding the kernel is a single point of failure, apparently required for GRUB. In the event of the loss of /boot, the Live ISO could be used to access the zpool, but restoring full system availability would require much more effort. The same likely will apply to raidz.


ZFS is the filesystem that is “often imitated, never duplicated”.

The main contenders for ZFS functionality appear to be Btrfs, Apple’s APFS and Microsoft’s ReFS. After many years of Btrfs development, it still lacks performance and maturity (“we are still refusing to support ‘Automatic Defragmentation’, ‘In-band Deduplication’ and higher RAID levels, because the quality of these options is not where it ought to be”). Apple very nearly bundled ZFS into OS X, but backed out and produced APFS instead. Microsoft is also trying to create a next-generation filesystem named ReFS, but in doing so it is once again proving Henry Spencer’s famous quote, “Those who do not understand Unix are condemned to reinvent it, poorly.” ReFS will lack compression, deduplication and copy-on-write snapshots.

All of us have critical data that we do not wish to lose. ZFS is the only filesystem option that is stable, protects our data, is proven to survive in most hostile environments and has a lengthy usage history with well understood strengths and weaknesses. Although many Linux administrators who need its features likely will load ZFS, the installation and maintenance tools have obvious shortcomings that can trap the unwary.

It is time once again to rely on Oracle’s largesse and ask them to open the ZFS filesystem fully to Linux for the benefit of the community. This will solve many problems, including Oracle’s, and it will engender goodwill in the Linux community that, at least from a filesystem perspective, is sorely lacking.


The views and opinions expressed in this article are those of the author and do not necessarily reflect those of Linux Journal.

Original Link

Linus Announces Linux 4.16 rc, KDE’s New Slimbook II, Damn Cool Editor Project and More

News briefs for February 12, 2018.

Linus Torvalds announced yesterday that Linux 4.16 rc1 is “out there”, the merge window has closed and things “look a lot better than with 4.15”. He went on to describe some changes: “Drivers may be the bulk (GPU, networking, staging, media, sound, infiniband, scsi and misc smaller subsystems), but we have a fair amount of arch updates (spectre and meltdown fixes for non-x86 architectures, but also some further x86 work, and just general arch updates). And there’s networking, filesystem updates, documentation, tooling….There’s a little bit for everybody, in other words.”

KDE announced its new Slimbook last week. The new Slimbook II “comes with a choice between an Intel i5: 2.5 GHz Turbo Boost 3.1 GHz – 3M Cache CPU, or an Intel i7: 2.7 GHz Turbo Boost 3.5 GHz with a 4M Cache. This makes the KDE Slimbook II 15% faster on average than its predecessor. The RAM has also been upgraded, and the KDE Slimbook now sports 4, 8, or 16 GBs of DDR4 RAM which is 33% faster than the DDR3 RAM installed on last year’s model.” The full list of specs is available here.

Damn Cool Editor project announced version 0.13 last week: “It’s a cool combination of an old fashioned tool and clever stuff for advanced users thinking that modern IDEs are just bloated toys with far too little focus on the main thing: your code.”

Scheduled for November 13–14, 2018, the Linux Plumbers Conference (in Vancouver) is already asking for networking-related proposals. More info is available here.

Note that SourceForge and Slashdot are switching over to a new data center this week (February 12–15) for a hardware refresh, and although they don’t anticipate issues, they want the community to be aware just in case.

Thanks to Petros Koutoupis for his contribution to this article.

Original Link

diff -u: Adding Encryption To printk()

When is security not security? When it guards against the wrong people or against things that never happen. A useless security measure is just another batch of code that might contain an exploitable bug. So the Linux developers always want to make sure a security patch is genuinely useful before pulling it in.

A patch from Dan Aloni recently came in to create the option to encrypt printk() output. This would make all dmesg information completely inaccessible to users, including hostile attackers. His idea was that the less information available to hostile users, the better.

The problem with this, as Steven Rostedt pointed out, was that it was essentially just a way for device makers and Linux distributions to shut out users from meaningfully understanding what their systems were doing. On the other hand, Steven said, he wouldn’t be opposed to including an option like that if a device maker or Linux distribution actually would find it legitimately useful.

He asked if anyone on the mailing list was part of a group that wanted such a feature, but no one stepped forward to defend it. On the contrary, Daniel Micay, an Android security contributor who was not part of the official Android development team, said that Android already prevented users from seeing dmesg output, using the SELinux module. So, Dan’s patch would be redundant in that case.

The mailing list discussion petered out around there. Maybe the goal of the patch after all was not about protecting users from hostile attackers, but about protecting vendors from users who want control of their systems.

The reason I sometimes write about these patch submissions that go nowhere is that the reasons they go nowhere are always interesting, and they also help me better understand the cases where patches come in and are accepted.

Original Link

diff -u: Detainting the Kernel

What’s new in kernel development: detainting the kernel.

Sometimes someone submits a patch without offering a clear explanation of why the patch would be useful, and when questioned by the developers, the person offers vague or hypothetical explanations. Something like that happened recently when Matthew Garrett submitted a patch to disable a running kernel’s ability to detect whether it was running entirely open-source code.

Specifically, he wanted to be able to load unsigned modules at runtime, without the kernel detecting the situation and “tainting” itself. Tainting the kernel doesn’t affect its behavior in any significant way, but it is extremely useful to the kernel developers, who typically will refuse to chase bug reports on any kernel that uses closed-source software. Without a fully open-source kernel, there’s no way to know that a given bug is inside the open or closed portion of the kernel. For this reason, anyone submitting bug reports to the kernel developers always should make sure to reproduce the bug on an untainted kernel.

Matthew’s patch would make it impossible for developers to know whether a kernel had or had not been tainted, and this could result in many wasted hours chasing bugs on kernels that should have been tainted.

So, why did Matthew want this patch in the kernel? It never was made clear. At times he seemed to suggest that the patch was simply a way to avoid having users complain about their kernel being tainted when it shouldn’t have been. At one point Ben Hutchings suggested that Matthew might want to allow third parties to sign modules on their own for some reason.

But as no one was able to get real clarity on the reason for the patch, and as tainting the kernel is traditionally a good way to avoid chasing down bugs in closed-source code, none of the developers seemed anxious to accept Matthew’s patch.

Original Link

OpenWall’s LKRG, Hitachi Joins the Open Invention Network, xcp-ng Kickstarter Campaign and More

News updates for February 6, 2018.

OpenWall recently announced the Linux Kernel Runtime Guard (LKRG), which is “a loadable kernel module that performs runtime integrity checking of the Linux kernel and detection of security vulnerability exploits against the kernel. As controversial as this concept is, LKRG attempts to post-detect and hopefully promptly respond to unauthorized modifications to the running Linux kernel (integrity checking) or to credentials (such as user IDs) of the running processes (exploit detection).” See the wiki for more information, and support the project on Patreon.

Hitachi has joined the Open Invention Network, “the largest patent non-aggression community in history”. According to Norihiro Suzuki, Vice President and Executive Officer, CTO of Hitachi, “Open source technology, especially Linux, drives innovation in areas that are critical to the customers that we serve, including technologies such as servers, storage, cloud, converged applications, big data and IoT. By joining Open Invention Network, we are demonstrating our continued commitment to open source technology, and supporting it with patent non-aggression in Linux.” See the press release for more information.

Recently we reported on Olivier Lambert’s revival of the Xen Cloud Platform (XCP) virtualization platform. In an attempt to raise some capital for the project, xcp-ng, Lambert started a Kickstarter campaign.

The GNOME project just announced the release of WebKitGTK+ 2.19.90. WebKitGTK+ is the GNOME platform port of the WebKit rendering engine.

An update of Peppermint OS, a lightweight distro based on Lubuntu, was released yesterday. This version “is a security refresh of the Peppermint 8 ISO images to include all updates to date (as of 3rd Feb 2018), including the Meltdown and Spectre mitigations such as the new HWE kernel 4.13.0-32 and the latest Chromium web browser version 64.”

Thanks to Petros Koutoupis for his contributions to this article.

Original Link

Meltdown/Spectre Status for Red Hat and Oracle

The Red Hat family of operating systems addressed Meltdown and Spectre in its v3.10 kernel quickly, but relied too much upon Intel’s flawed microcode and was forced to revert from a complete solution. Oracle implemented alternate approaches more suited to its v4.1 UEK, but both kernels continue to lack full Spectre coverage while they wait for Intel. Conspicuously absent from either Linux branch is Google’s retpoline, which offers far greater and more efficient coverage for all CPUs. Auditing this status is a challenge. This article presents the latest tools for vulnerability assessments.

A frenzy of patch activity has surrounded this year’s Meltdown and Spectre CPU vulnerability disclosures. Normally quiet microcode packages for Intel chips have seen four updates in the month of January, one of which was finally to roll back flawed code that triggers random reboots. For enterprise-grade hardware, Intel’s quality control has left much to be desired.

It is likely premature to deploy new monitoring and compliance tools, and a final solution for this set of vulnerabilities will wait until correct microcode is obtained. Still, it may be important for many organizations to evaluate the patch status of servers running Linux kernels packaged by Oracle and/or Red Hat.

Meltdown patches exist now and should be deployed immediately on vulnerable servers. Remediating all Spectre vulnerabilities requires not only the latest kernels, but also a patched GCC to compile the kernel that is capable of implementing “retpolines”, or compatible microcode from your CPU vendor.


Red Hat was one of the first Linux distributions to publish guidance on Meltdown and Spectre. It established three files as “kernel tunables” in the /sys/kernel/debug/x86 directory to monitor and control these patches: pti_enabled for Meltdown, ibpb_enabled for Spectre v1 and ibrs_enabled for Spectre v2. Only the root user can access these files.

When these files contain a numerical zero, the patches are not active. If allowed for the CPU, a numerical one may be written to the file to enable the relevant remediation, and a zero may be written later to disable it. This is not always allowed—AMD processors are not vulnerable to Meltdown, and the value in the pti_enabled file is locked to zero and cannot be changed. If the fixes are active and show 1, the performance of the CPU may be reduced. Compatible microcode is required to enable all patches on vulnerable CPUs, which adds new assembler/machine language op codes that erase vulnerable kernel data from CPU pipelines and caches to close the exploit.
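For a quick audit, all three files can be read in one pass. The sketch below is ours, not part of Red Hat’s guidance: the report_tunables helper and the mock directory exist purely so the snippet can be demonstrated without root; on a real system, point the helper (as root) at /sys/kernel/debug/x86.

```shell
# Sketch: summarize the RHCK Meltdown/Spectre tunables.  report_tunables is
# a helper of our own; on a real system its argument would be
# /sys/kernel/debug/x86 (readable by root only).
report_tunables() {
    dir=$1
    for name in pti_enabled ibpb_enabled ibrs_enabled; do
        f="$dir/$name"
        if [ ! -r "$f" ]; then
            echo "$name: not present"
        elif [ "$(cat "$f")" = "0" ]; then
            echo "$name: patch inactive"
        else
            echo "$name: patch active"
        fi
    done
}

# Demo against a mock tree (a real audit would use /sys/kernel/debug/x86):
mock=$(mktemp -d)
echo 1 > "$mock/pti_enabled"
echo 1 > "$mock/ibpb_enabled"
echo 0 > "$mock/ibrs_enabled"
report_tunables "$mock"
```

A value locked at zero (as pti_enabled is on AMD) simply reports as inactive, which matches the kernel’s view that the patch is unnecessary there.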

It is not generally understood that, although the BIOS is responsible for providing a base microcode image, the Linux kernel is able to update some CPUs at boot with a volatile, runtime upgrade for Intel microcode. The update must come from the CPU vendor, carrying its digital signature; it cannot be produced independently by the OS maintainers. This is accomplished on Intel CPUs with the help of the following RPM (microcode for AMD CPUs is contained within the larger “linux-firmware” package):

# rpm -qi microcode_ctl
Name : microcode_ctl
Epoch : 2
Version : 2.1
Release :
Architecture: x86_64
Install Date: Sun 21 Jan 2018 12:58:36 PM CST
Group : System Environment/Base
Size : 1001060
License : GPLv2+ and Redistributable, no modification permitted
Signature : RSA/SHA256, Sat 20 Jan 2018 08:32:03 PM CST, Key ID
Source RPM : microcode_ctl-2.1-
Build Date : Sat 20 Jan 2018 08:31:55 PM CST
Build Host :
Relocations : (not relocatable)
Vendor : Oracle America
Summary : Tool to transform and deploy CPU microcode update for x86.
Description :
The microcode_ctl utility is a companion to the microcode driver written
by Tigran Aivazian <>. The microcode update is volatile and needs to be uploaded on each system
boot i.e. it doesn't reflash your cpu permanently, reboot and it reverts
back to the old microcode.

Red Hat has “washed its hands” of Intel’s recent flawed microcode, reverting to previous packages and advising customers to obtain BIOS updates for Spectre remediation. Although Red Hat very likely will publish new Intel microcode updates again at some point, its clear and obvious frustration could delay full security compliance.


Oracle distributes the Unbreakable Enterprise Kernel (UEK) as its recommended Linux kernel. The stock Red Hat kernel, where used within Oracle’s distribution, has been retroactively renamed the Red Hat-Compatible Kernel (RHCK), and it is the RHCK that contains the tunable files in /sys/kernel/debug/x86.

The UEK is a popular choice in enterprise Linux, and it brings many new features to RPM distributions. It is the easiest and most stable method to implement a v4.1 performance-tuned kernel in a supported manner on a Red Hat-derivative OS (which usually runs v3.10 at best).

The UEK is especially notable as it works with Ksplice (as does the RHCK), which allows for online kernel patches and upgrades without rebooting, and it even can extend this ability to select userland libraries (glibc and OpenSSL). This service is offered for free to Fedora and Ubuntu users, and as a paid-support option for both CentOS (after a scripted conversion to Oracle Linux) and Red Hat. Ksplice is the oldest and most mature high-availability option for Linux security upgrades.

The UEK implements a different set of world-readable monitor files, and they do not appear to allow adjustment. This is not generally well known (the information was obtained from Oracle support).

Oracle suggests that you can learn the Meltdown and Spectre status under a running UEK from the following:

# awk '{print FILENAME":"$0}' /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Not affected

The UEK also adds commentary to /proc/cpuinfo regarding the bug status of the running CPU (which does not appear when booted on the RHCK):

# egrep '(model name|bugs)' /proc/cpuinfo | sort -u
bugs : cpu_meltdown spectre_v2
model name : Intel(R) Core(TM)2 Duo CPU E4600 @ 2.40GHz

Oracle further recommends that Document ID 2348448.1 be consulted for full details on Meltdown and Spectre remediation under Linux. Oracle Solaris is also vulnerable and is addressed in the latest Critical Patch Update. Note that, since updated microcode will be required to remediate Spectre on supported Intel CPUs, a reboot will be required even for systems on Ksplice.
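Because the two kernels expose different interfaces, a portable audit has to try both. The sketch below is our own (the check_status helper takes its directories as parameters purely so it can be demonstrated without root); the real paths are /sys/devices/system/cpu/vulnerabilities for the UEK and upstream kernels, and /sys/kernel/debug/x86 for the RHCK.

```shell
# Sketch: report mitigation status from whichever interface the running
# kernel provides, preferring the UEK/upstream /sys vulnerabilities files
# and falling back to the RHCK debugfs tunables.
check_status() {
    vuln_dir=$1 tunable_dir=$2
    if [ -d "$vuln_dir" ]; then
        for f in "$vuln_dir"/*; do
            echo "$(basename "$f"): $(cat "$f")"
        done
    elif [ -d "$tunable_dir" ]; then
        for f in "$tunable_dir"/*_enabled; do
            echo "$(basename "$f"): $(cat "$f")"
        done
    else
        echo "no mitigation interface found"
    fi
}

# Demo against a mock of the UEK-style interface:
mock=$(mktemp -d)
echo "Mitigation: PTI" > "$mock/meltdown"
echo "Not affected"    > "$mock/spectre_v1"
check_status "$mock" /sys/kernel/debug/x86
```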


Stephane Lesimple published an elaborate detection script on GitHub. The script is a tour de force that tests for all known mitigations and microcode. The latest version is now (mostly) able to detect the UEK properly, and it has output options for JSON and NRPE. This is likely to remain the authoritative audit tool for speculative attacks on Linux for the immediate future.

Lesimple’s script also probes for the presence of “retpolines” in the kernel. Retpolines concern a CPU’s “indirect addressing modes” when a register is involved in computing the target of a branch (“jump tables” often are mentioned). The retpoline essentially plays out the Spectre exploit for every vulnerable jump in the kernel binary, and a capable compiler is required to implement it. A Linux kernel that implements retpolines is not vulnerable to Spectre v2 even when running on older microcode, and it maintains complete control of speculative execution. The performance impact has been demonstrated to be far less dramatic than a kernel using the microcode fixes on Intel’s own Clear Linux distribution.

Google (who developed and introduced the concept) has pushed retpolines into Android’s January security advisory and deployed retpoline-enabled kernels throughout its data centers. It likely will be some time before retpolines begin to appear outside of supported Nexus and Pixel devices—LineageOS (known for its speed in pushing patches) has only a single device with a retpoline-enabled kernel at this time.

Unfortunately, neither the UEK nor the RHCK appear to have retpoline support at present. The backporting effort into legacy GCC appears to be an obstacle.

In any case, here are demonstration runs of Lesimple’s script on various architectures. To start is an Oracle Linux system on the first Meltdown-aware RHCK with the first January microcode release. You can see from the RHCK tunables that all patches are active:

# for x in /sys/kernel/debug/x86/*_enabled; do echo "$x:$(<$x)"; done
/sys/kernel/debug/x86/ibpb_enabled:1
/sys/kernel/debug/x86/ibrs_enabled:1
/sys/kernel/debug/x86/pti_enabled:1

The script warns that the microcode is not stable, but confirms the system is (mostly) patched—it is not detecting the Spectre V1 fix:

# ./
Spectre and Meltdown mitigation detection tool v0.34
Checking for vulnerabilities on current system
Kernel is Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Wed Jan 3 18:59:47 PST 2018 x86_64
CPU is Intel(R) Pentium(R) CPU G3220T @ 2.60GHz
We're missing some kernel info (see -v), accuracy might be reduced
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES
  * Enhanced IBRS (IBRS_ALL)
    * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): NO
  * CPU microcode is known to cause stability problems: YES (model 60 stepping 3 ucode 0x23)
The microcode your CPU is running on is known to cause instability problems, such as intempestive reboots or random crashes.
You are advised to either revert to a previous microcode version (that might not have the mitigations for Spectre), or upgrade to a newer one if available.
* CPU vulnerability to the three speculative execution attacks variants
  * Vulnerable to Variant 1: YES
  * Vulnerable to Variant 2: YES
  * Vulnerable to Variant 3: YES
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Kernel has array_index_mask_nospec: UNKNOWN (couldn't check (couldn't find your kernel image in /boot, if you used netboot, this is normal))
* Checking count of LFENCE instructions following a jump in kernel: UNKNOWN (couldn't check (couldn't find your kernel image in /boot, if you used netboot, this is normal))
> STATUS: UNKNOWN (Couldn't find kernel image or tools missing to execute the checks)
CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
  * Kernel is compiled with IBRS/IBPB support: YES
  * Currently enabled features
    * IBRS enabled for Kernel space: YES
    * IBRS enabled for User space: NO
    * IBPB enabled: YES
* Mitigation 2
  * Kernel compiled with retpoline option: UNKNOWN (couldn't read your kernel configuration)
  * Kernel compiled with a retpoline-aware compiler: NO
  * Retpoline enabled: NO
> STATUS: NOT VULNERABLE (IBRS/IBPB are mitigating the vulnerability)
CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)
A false sense of security is worse than no security at all, see --disclaimer

Running the latest version of the script under the UEK now yields reasonable results. The UEK files are specifically mentioned in the script, with the following repeated commentary: # this kernel has the /sys interface, trust it over everything. Here are the results:

# ./
Spectre and Meltdown mitigation detection tool v0.34
Checking for vulnerabilities on current system
Kernel is Linux 4.1.12-112.14.13.el7uek.x86_64 #2 SMP Thu Jan 18 11:38:29 PST 2018 x86_64
CPU is Intel(R) Core(TM)2 Duo CPU E4600 @ 2.40GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: NO
    * CPU indicates IBRS capability: NO
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: NO
    * CPU indicates IBPB capability: NO
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: NO
    * CPU indicates STIBP capability: NO
  * Enhanced IBRS (IBRS_ALL)
    * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): NO
  * CPU microcode is known to cause stability problems: NO (model 15 stepping 13 ucode 0xa4)
* CPU vulnerability to the three speculative execution attacks variants
  * Vulnerable to Variant 1: YES
  * Vulnerable to Variant 2: YES
  * Vulnerable to Variant 3: YES
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Mitigated according to the /sys interface: YES (kernel confirms that your CPU is unaffected)
* Kernel has array_index_mask_nospec: NO
* Checking count of LFENCE instructions following a jump in kernel: YES (34 jump-then-lfence instructions found, which is >= 30 (heuristic))
> STATUS: NOT VULNERABLE (Not affected)
CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigated according to the /sys interface: NO (kernel confirms your system is vulnerable)
* Mitigation 1
  * Kernel is compiled with IBRS/IBPB support: YES
  * Currently enabled features
    * IBRS enabled for Kernel space: NO
    * IBRS enabled for User space: NO
    * IBPB enabled: UNKNOWN
* Mitigation 2
  * Kernel compiled with retpoline option: NO
  * Kernel compiled with a retpoline-aware compiler: NO
  * Retpoline enabled: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)
CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Mitigated according to the /sys interface: YES (kernel confirms that the mitigation is active)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
A false sense of security is worse than no security at all, see --disclaimer

Note that the UEK’s Spectre V1 fix is detected via “heuristic” counts of jump-then-lfence instructions; hopefully this is reliable.


It is important for OS kernel vendors to understand that the user community wants these security fixes with a minimum loss of performance and stability. Intel’s microcode updates have failed to meet this requirement, and it now falls upon operating-system maintainers to close the exploits. Continuing questions should be pointed at Intel regarding why faulty firmware emerged, why it was slower than retpolines, and how a redesigned microcode solution could improve upon retpoline performance. From a performance perspective, doing nothing is literally preferable to applying Intel’s January efforts.

However, if Red Hat continues to profess that it has “washed its hands” of this issue, there is a clear opening for Oracle to establish a retpoline-enabled UEK as a widely adopted option in the user community. Although one would hope that all vendors are attentive and responsive to their users, the UEK would become a much more visible and formidable option in the RPM world if retpolines were added to its already considerable advantages. Even a (clearly labeled) beta test version would take it far in this regard.

What is certain is that this tale is far from over. The right people have not spoken, and the right words have not been said. Hopefully, it ends quickly and well.


The views and opinions expressed in this article are those of the author and do not necessarily reflect those of Linux Journal.

Original Link

Custom Embedded Linux Distributions

The proliferation of inexpensive IoT boards means the time has come to gain control not only of applications but also the entire software platform. So, how do you build a custom distribution with cross-compiled applications targeted for a specific purpose? As Michael J. Hammel explains here, it’s not as hard as you might think.

Why Go Custom?

In the past, many embedded projects used off-the-shelf distributions and stripped them down to bare essentials for a number of reasons. First, removing unused packages reduced storage requirements. Embedded systems are typically shy of large amounts of storage at boot time, and the storage available, in non-volatile memory, can require copying large amounts of the OS to memory to run. Second, removing unused packages reduced possible attack vectors. There is no sense hanging on to potentially vulnerable packages if you don’t need them. Finally, removing unused packages reduced distribution management overhead. Having dependencies between packages means keeping them in sync if any one package requires an update from the upstream distribution. That can be a validation nightmare.

Yet, starting with an existing distribution and removing packages isn’t as easy as it sounds. Removing one package might break dependencies held by a variety of other packages, and dependencies can change in the upstream distribution management. Additionally, some packages simply cannot be removed without great pain due to their integrated nature within the boot or runtime process. All of this takes control of the platform outside the project and can lead to unexpected delays in development.

A popular alternative is to build a custom distribution using build tools available from an upstream distribution provider. Both Gentoo and Debian provide options for this type of bottom-up build. The most popular of these is probably the Debian debootstrap utility. It retrieves prebuilt core components and allows users to cherry-pick the packages of interest in building their platforms. But, debootstrap originally was only for x86 platforms. Although there are ARM (and possibly other) options now, debootstrap and Gentoo’s catalyst still take dependency management away from the local project.
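For concreteness, a first-stage debootstrap run for an ARM target might look like the following. The suite, mirror, and target path are illustrative; --foreign performs only the download stage on the x86 host, with the second stage finished later on the target (or under QEMU). The command is assembled into a variable here purely so the sketch can be shown without actually downloading anything.

```shell
# Illustrative debootstrap invocation for an ARM root filesystem.
# Run the resulting command as root on the build host.
ARCH=armhf
SUITE=stretch
TARGET=./rootfs
MIRROR=http://deb.debian.org/debian

CMD="debootstrap --arch=$ARCH --foreign $SUITE $TARGET $MIRROR"
echo "$CMD"
```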

Some people will argue that letting someone else manage the platform software (like Android) is much easier than doing it yourself. But, those distributions are general-purpose, and when you’re sitting on a lightweight, resource-limited IoT device, you may think twice about any advantage that is taken out of your hands.

System Bring-Up Primer

A custom Linux distribution requires a number of software components. The first is the toolchain. A toolchain is a collection of tools for compiling software, including (but not limited to) a compiler, linker, binary manipulation tools and standard C library. Toolchains are built specifically for a target hardware device. A toolchain built on an x86 system that is intended for use with a Raspberry Pi is called a cross-toolchain. When working with small embedded devices with limited memory and storage, it’s always best to use a cross-toolchain. Note that even applications written for a specific purpose in a scripted language like JavaScript will need to run on a software platform that needs to be compiled with a cross-toolchain.

Figure 1. Compile Dependencies and Boot Order

The cross-toolchain is used to build software components for the target hardware. The first component needed is a bootloader. When power is applied to a board, the processor (depending on design) attempts to jump to a specific memory location to start running software. That memory location is where a bootloader is stored. Hardware can have a built-in bootloader that can be run directly from its storage location or it may be copied into memory first before it is run. There also can be multiple bootloaders. A first-stage bootloader would reside on the hardware in NAND or NOR flash, for example. Its sole purpose would be to set up the hardware so a second-stage bootloader, such as one stored on an SD card, can be loaded and run.

Bootloaders have enough knowledge to get the hardware to the point where it can load Linux into memory and jump to it, effectively handing control over to Linux. Linux is an operating system. This means that, by design, it doesn’t actually do anything other than monitor the hardware and provide services to higher layer software—aka applications. The Linux kernel often is accompanied by a variety of firmware blobs. These are software objects that have been precompiled, often containing proprietary IP (intellectual property) for devices used with the hardware platform. When building a custom distribution, it may be necessary to acquire any firmware blobs not provided by the Linux kernel source tree before beginning compilation of the kernel.

Applications are stored in the root filesystem. The root filesystem is constructed by compiling and collecting a variety of software libraries, tools, scripts and configuration files. Collectively, these all provide the services, such as network configuration and USB device mounting, required by applications the project will run.

In summary, a complete system build requires the following components:

  1. A cross-toolchain.

  2. One or more bootloaders.

  3. The Linux kernel and associated firmware blobs.

  4. A root filesystem populated with libraries, tools and utilities.

  5. Custom applications.

Start with the Right Tools

The components of the cross-toolchain can be built manually, but it’s a complex process. Fortunately, tools exist that make this process easier. The best of them is probably Crosstool-NG. This project utilizes the same kconfig menu system used by the Linux kernel to configure the bits and pieces of the toolchain. The key to using this tool is finding the correct configuration items for the target platform. This typically includes the following items:

  1. The target architecture, such as ARM or x86.

  2. Endianness: little (typically Intel) or big (typically ARM or others).

  3. CPU type as it’s known to the compiler, such as GCC’s use of either -mcpu or --with-cpu.

  4. The floating point type supported, if any, by the CPU, such as GCC’s use of either -mfpu or --with-fpu.

  5. Specific version information for the binutils package, the C library and the C compiler.

Figure 2. Crosstool-NG Configuration Menu

The first four are typically available from the processor maker’s documentation. It can be hard to find these for relatively new processors, but for the Raspberry Pi or BeagleBoards (and their offspring and off-shoots), you can find the information online at places like the Embedded Linux Wiki.

The versions of the binutils, C library and C compiler are what will separate the toolchain from any others that might be provided from third parties. First, there are multiple providers of each of these things. Linaro provides bleeding-edge versions for newer processor types, while working to merge support into upstream projects like the GNU C Library. Although you can use a variety of providers, you may want to stick to the stock GNU toolchain or the Linaro versions of the same.

Another important selection in Crosstool-NG is the version of the Linux kernel. This selection supplies kernel headers for use with various toolchain components, but it does not have to be the same as the Linux kernel you will boot on the target hardware. It’s important to choose a kernel that is not newer than the target hardware’s kernel. When possible, pick a long-term support kernel that is older than the kernel that will be used on the target hardware.

For most developers new to custom distribution builds, the toolchain build is the most complex process. Fortunately, binary toolchains are available for many target hardware platforms. If building a custom toolchain becomes problematic, search online at places like the Embedded Linux Wiki for links to prebuilt toolchains.

Booting Options

The next component to focus on after the toolchain is the bootloader. A bootloader sets up hardware so it can be used by ever more complex software. A first-stage bootloader is often provided by the target platform maker, burned into on-hardware storage like an EEPROM or NOR flash. The first-stage bootloader will make it possible to boot from, for example, an SD card. The Raspberry Pi has such a bootloader, which makes creating a custom bootloader unnecessary.

Despite that, many projects add a secondary bootloader to perform a variety of tasks. One such task could be to provide a splash animation without using the Linux kernel or userspace tools like plymouth. A more common secondary bootloader task is to make network-based boot or PCI-connected disks available. In those cases, a tertiary bootloader, such as GRUB, may be necessary to get the system running.

Most important, bootloaders load the Linux kernel and start it running. If the first-stage bootloader doesn’t provide a mechanism for passing kernel arguments at boot time, a second-stage bootloader may be necessary.

A number of open-source bootloaders are available. The U-Boot project often is used for ARM platforms like the Raspberry Pi. CoreBoot typically is used for x86 platforms like the Chromebook. Bootloaders can be very specific to target hardware. The choice of bootloader will depend on overall project requirements and target hardware (lists of open-source bootloaders can be found online).

Now Bring the Penguin

The bootloader will load the Linux kernel into memory and start it running. Linux is like an extended bootloader: it continues hardware setup and prepares to load higher-level software. The core of the kernel will set up and prepare memory for sharing between applications and hardware, prepare task management to allow multiple applications to run at the same time, initialize hardware components that were not configured by the bootloader or were configured incompletely and begin interfaces for human interaction. The kernel may not be configured to do this on its own, however. It may include an embedded lightweight filesystem, known as the initramfs or initrd, that can be created separately from the kernel to assist in hardware setup.
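Since the initramfs is just a compressed cpio archive that the kernel unpacks into memory at boot, it is easy to see how one is put together. The following is a minimal sketch of our own: the stub /init and the skeleton directories are illustrative, and a real image would populate /bin with BusyBox or similar before the shell at the end of /init could actually run.

```shell
# Sketch: pack a minimal root skeleton into the newc cpio format the kernel
# expects for an initramfs.  A real initramfs needs at least an executable
# /init; ours is a stub for illustration.
root=$(mktemp -d)
mkdir -p "$root/bin" "$root/dev" "$root/proc" "$root/sys"
cat > "$root/init" <<'EOF'
#!/bin/sh
mount -t proc none /proc
exec /bin/sh
EOF
chmod +x "$root/init"

# Archive the tree; the kernel can consume the result directly at boot.
( cd "$root" && find . | cpio -o -H newc ) | gzip > /tmp/initramfs.gz
ls -l /tmp/initramfs.gz
```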

Another thing the kernel handles is downloading binary blobs, known generically as firmware, to hardware devices. Firmware consists of precompiled object files, in formats specific to a particular device, used to initialize hardware in places that the bootloader and kernel cannot access. Many such firmware objects are available from the Linux kernel source repositories, but many others are available only from specific hardware vendors. Examples of devices that often provide their own firmware include digital TV tuners or WiFi network cards.

Firmware may be loaded from the initramfs or may be loaded after the kernel starts the init process from the root filesystem. However, creating the kernel often will be the process where obtaining firmware will occur when creating a custom Linux distribution.

Lightweight Core Platforms

The last thing the Linux kernel does is to attempt to run a specific program called the init process. This can be named init or linuxrc or the name of the program can be passed to the kernel by the bootloader. The init process is stored in a file system that the kernel can access. In the case of the initramfs, the file system is stored in memory (either by the kernel itself or by the bootloader placing it there). But the initramfs is not typically complete enough to run more complex applications. So another file system, known as the root file system, is required.

Figure 3. Buildroot Configuration Menu

The initramfs filesystem can be built using the Linux kernel itself, but more commonly, it is created using a project called BusyBox. BusyBox combines a collection of GNU utilities, such as grep or awk, into a single binary in order to reduce the size of the filesystem itself. BusyBox often is used to jump-start the root filesystem’s creation.
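BusyBox’s single-binary trick relies on argv[0]: each utility name is a symlink to the one busybox binary, which dispatches on the name it was invoked by. The sketch below is purely illustrative and ours (real BusyBox does this in C inside a single compiled binary), but it shows the mechanism:

```shell
# Sketch of BusyBox-style argv[0] dispatch: one program, many names.
dir=$(mktemp -d)
cat > "$dir/multitool" <<'EOF'
#!/bin/sh
# Behave differently depending on the name we were invoked by.
case "$(basename "$0")" in
    hello) echo "hello applet" ;;
    bye)   echo "bye applet" ;;
    *)     echo "unknown applet" ;;
esac
EOF
chmod +x "$dir/multitool"

# Each "utility" is just a symlink back to the single program.
ln -s "$dir/multitool" "$dir/hello"
ln -s "$dir/multitool" "$dir/bye"

"$dir/hello"   # prints "hello applet"
"$dir/bye"     # prints "bye applet"
```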

But, BusyBox is purposely lightweight. It isn’t intended to provide every tool that a target platform will need, and even those it does provide can be feature-reduced. BusyBox has a sister project known as Buildroot, which can be used to get a complete root filesystem, providing a variety of libraries, utilities and scripting languages. Like Crosstool-NG and the Linux kernel, both BusyBox and Buildroot allow custom configuration using the kconfig menu system. More important, the Buildroot system handles dependencies automatically, so selection of a given utility will guarantee that any software it requires also will be built and installed in the root filesystem.

Buildroot can generate a root filesystem archive in a variety of formats. However, it is important to note that the filesystem only is archived. Individual utilities and libraries are not packaged in either Debian or RPM formats. Using Buildroot will generate a root filesystem image, but its contents are not managed packages. Despite this, Buildroot does provide support for both the opkg and rpm package managers. This means custom applications that will be installed on the root filesystem can be package-managed, even if the root filesystem itself is not.

Cross-Compiling and Scripting

One of Buildroot’s features is the ability to generate a staging tree. This directory contains libraries and utilities that can be used to cross-compile other applications. With a staging tree and the cross toolchain, it becomes possible to compile additional applications outside Buildroot on the host system instead of on the target platform. Using rpm or opkg, those applications then can be installed to the root filesystem on the target at runtime using package management software.
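A cross-compile against the staging tree typically means pointing a package’s configure step at the cross-compiler and sysroot. The invocation below is a hedged sketch: the toolchain triplet and paths are illustrative (Buildroot places the staging tree and cross-gcc under its output/ directory), and the command is assembled into a variable so the sketch can be shown without a toolchain installed.

```shell
# Hypothetical cross-compile of an out-of-tree application against a
# Buildroot staging tree.  Triplet and paths are illustrative.
STAGING=output/staging
CROSS=arm-buildroot-linux-gnueabihf

CMD="./configure --host=$CROSS CC=${CROSS}-gcc CFLAGS=--sysroot=$STAGING"
echo "$CMD"
```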

Most custom systems are built around the idea of building applications with scripting languages. If scripting is required on the target platform, a variety of choices are available from Buildroot, including Python, PHP, Lua and JavaScript via Node.js. Support also exists for applications requiring encryption using OpenSSL.

What’s Next

The Linux kernel and bootloaders are compiled like most applications. Their build systems are designed to build a specific bit of software. Crosstool-NG and Buildroot are metabuilds. A metabuild is a wrapper build system around a collection of software, each with their own build systems. Alternatives to these include Yocto and OpenEmbedded. The benefit of Buildroot is the ease with which it can be wrapped by an even higher-level metabuild to automate customized Linux distribution builds. Doing this opens the option of pointing Buildroot to project-specific cache repositories. Using cache repositories can speed development and offers snapshot builds without worrying about changes to upstream repositories.

An example implementation of a higher-level build system is PiBox. PiBox is a metabuild wrapped around all of the tools discussed in this article. Its purpose is to add a common GNU Make target construction around all the tools in order to produce a core platform on which additional software can be built and distributed. The PiBox Media Center and kiosk projects are implementations of application-layer software installed on top of the core platform to produce a purpose-built platform. The Iron Man project is intended to extend these applications for home automation, integrated with voice control and IoT management.

But PiBox is nothing without these core software tools and could never run without an in-depth understanding of a complete custom distribution build process. And, PiBox could not exist without the long-term dedication of the teams of developers for these projects who have made custom-distribution-building a task for the masses.

Original Link

Kernel 4.16-rc1, Qubes OS 4.0, OpenSUSE’s Tumbleweed and More

News updates for February 1, 2018.

Today Linux kernel 4.16-rc1 introduces three new driver subsystems and the VirtualBox Guest driver. See the pull request for the complete list of changes.

Qubes OS 4.0-rc4 has been released, which “contains important safeguards against the Spectre and Meltdown attacks, as well as bug fixes for many of the issues discovered in the previous release candidate.” See the release notes for more details.

Cisco announced its new Cisco Container Platform yesterday, which “simplifies and accelerates how application development and information technology (IT) operations teams configure, deploy, and manage container clusters based on 100 percent upstream Kubernetes.”

openSUSE’s rolling release Tumbleweed had six new software snapshots released this week, which included an update of GCC to 7.3 and firewalld replacing SuSEFirewall2.

The winners of the Google Code-in 2017 were announced yesterday on the Google Open Source Blog.

Original Link

KDE Plasma 5.12, Btrfs Improvement, Linux Support for Wacom SmartPad Devices and More

News updates for January 30, 2018.

Interested in giving KDE Plasma 5.12 LTS Desktop a spin (currently in beta)? Look no further than the latest snapshot releases of OpenSUSE Tumbleweed.

Speaking of KDE Plasma, you now can test the mobile port in a VirtualBox or KVM virtual machine. ISO images are available here.

In Linux kernel-related news, expect to see a ton of Btrfs improvements scheduled for the 4.16 release. These even include better RAID 5/6 support.

Two developers from Red Hat, Peter Hutterer and Benjamin Tissoires, have announced the Tuhi project, a dæmon providing Linux support for Wacom SmartPad devices like the Bamboo Slate and Bamboo Spark.

SentinelOne recently released a free Linux tool called Blacksmith that detects the Meltdown exploit: “this tool detects the attempted exploitation of Meltdown vulnerability on all Linux systems, empowering Linux admins to stop attacks before they take root”.

Thanks to Petros Koutoupis for his contributions to this article.

Original Link

Linux 4.15 Kernel, GCC, LinuxBoot Project and More Cryptojacking

News briefs for January 29, 2018.

The good: the Linux 4.15 kernel officially has been released. View the diff here, and also see the Linux Kernel Archives for more info.

The bad: more work needs to be done to handle Spectre/Meltdown security vulnerabilities. Linus makes it clear that compiler updates will need to work alongside the kernel ones to help mitigate these issues.

The ugly: unless you are running GCC 7.3 or later, you are still not in the clear.

The Linux Foundation recently announced the new LinuxBoot Project, which “looks to improve system boot performance and reliability by replacing some firmware functionality with a Linux kernel and runtime.”

YouTube recently was caught serving ads with cryptojacking malware, as reported by Ars Technica and others.

Thanks to Petros Koutoupis for his contributions to this article.

Original Link

diff -u: Complexifying printk()

What’s new in kernel development: complexifying printk().

It’s so simple! The kernel decides to output a log message, so it calls printk() to send the message to a serial console—except it’s not simple at all. What if the kernel is in the middle of crashing, and the log message is the crucial clue needed to diagnose the problem? How do you output a log message when you don’t know what parts of the system you even can rely on? What if the system’s out of memory or trapped in an atomic context, unable to switch from whatever’s breaking to the code to execute the printk()?

There are all sorts of corner cases that safely can be ignored by user code producing output, but that are essential to get right when the kernel is the one producing output.

To make matters worse, these corner cases tend to occur in ways that are difficult to reproduce, creating potential controversy over whether a bug exists at all. How do you reproduce a bug that causes the very logging system to fail to tell you what happened?

A multi-year debate recently hit the kernel mailing list again, as Sergey Senozhatsky posted a patch to fix a system crash he and others had been seeing in their companies’ data centers. Unfortunately, his solution added a lot of complexity to the already complicated printk() code, and the bug it fixed could not be reproduced on demand.

Steven Rostedt had a separate patch for printk(), also very complicated, but much simpler than Sergey’s patch. The only problem was, it didn’t address the issue Sergey was trying to fix. So, much of the email thread consisted of Sergey and others trying to convince Steven and others that the bug was real. They posted process traces and logic paths, but Steven was never convinced. In particular, in the cases where Sergey’s side was able to demonstrate an actual problem, Steven objected that the use-case was completely unrealistic—for example, when a single CPU was doing all the work while all the other CPUs on the system remained idle.

Meanwhile, a significant number of people simply wanted to push Steven’s code into the mainline kernel source and see what happened. Maybe Sergey’s system crashes would stop? But, just throwing something into the wild like that is never an appealing option, and it seemed to grow out of the fact that it’s so difficult to know what printk() is actually doing when things go wrong.

There was absolutely no conclusion during the discussion. The multi-year debate remains a multi-year debate. However, there may have been a slight nudge in one direction or another, because Steven has started to feel that the system crashes Sergey is trying to fix may in fact be real, but caused by something else, and he’s begun working with Sergey to try to diagnose that.

It’s all very interesting, because even though the developers experience a lot of frustration with each other in a situation like this, they still keep working together, trying to find a way through. There’s no way to know for sure how it will end up. Maybe someone will find a way to explode the entire printk() code into a bunch of smaller yet simpler elements, so that bugs can be reproduced and analyzed, instead of remaining intractable.

Original Link

diff -u: in-Kernel DRM Support

A look at what’s new in kernel development.

Welcome to the new diff -u! We’re experimenting with a shorter, more frequent, single-subject format for this feature, which also may evolve over time. Let us know what you think in the comments below.

Recently there’s been an effort to add support for digital rights management (DRM) into the Linux kernel. The goal of DRM is to prevent users from making copies of music, video and other media that they watch on their own computers, but it also poses fundamental questions about the nature and fate of general-purpose computers.

Sean Paul, from the ChromeOS developer team, submitted a patch to enable DRM encryption running through certain pieces of DRM hardware, including exynos, mediatek and rockchip. The patch itself is not new—ChromeOS has been using it in-house for years. However, if these DRM patches could get into the official kernel tree, any Linux system running on the proper hardware—not just ChromeOS systems—could support DRM controls.

The code was highly targeted to make it through the gauntlet of kernel patch submission. It didn’t go so far as to implement features that would take control away from the user. All it did was implement encryption via High-bandwidth Digital Content Protection (HDCP) and allow the user to turn on and off the hardware that would use the encrypted HDCP data stream.

In other words, the patch theoretically implemented just a general-purpose cryptographic feature that might be used for something other than DRM. And as Daniel Vetter put it in the mailing list discussion, any full DRM implementation also would need an unlockable boot-loader, as well as a variety of userspace code. At that point, he said, “yes, then you don’t really own the machine fully.”

Pavel Machek didn’t like this at all and felt that accepting even Sean’s relatively generic patch would encourage hardware vendors to install locked-down versions of Linux at the factory, thus preventing users from altering the OS themselves. He added, “That is evil, and a direct threat to Free Software movement.”

Pavel also pointed out that any normal user who was not on a vendor-controlled, DRM-locked system, would get no benefit whatsoever from Sean’s patch. On an unlocked system, there was simply no reason to enable the feature at all.

And so, even though the patch only enables features that are already present in a system’s hardware, and even though as implemented it would be an optional kernel feature, there is already strong opposition to it because of the threat that such a patch might undermine the future availability of user-controlled computers and the likely proliferation of Linux systems that violate the spirit, if not the letter, of the GPL.

But in spite of this threat, it’s still possible that this patch, or something similar, could go into the kernel. As Alan Cox pointed out during the discussion, the code simply needs to implement a feature that has a general-purpose use. If the code’s only value is to lock down systems, Linus Torvalds would be unlikely ever to accept it. But, if the hardware has some other legitimate purpose, and if the encryption keys are held by the actual user and not the vendor, a patch to enable it could successfully make it through.

Original Link

Spectre Patches, Snap, Happy Birthday LWN and More

News updates for January 22, 2018.

Are you using protection? Longtime kernel developer Greg Kroah-Hartman just posted a simple recipe for users to verify whether they are running a Spectre/Meltdown-patched version of the Linux kernel.
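Greg's recipe is worth reading in full, but on kernels new enough to carry the reporting patches (mainline 4.15+, or distro kernels with backports), the sysfs vulnerabilities interface exposes mitigation status directly. A minimal sketch along those lines, assuming the sysfs path exists only on patched kernels:

```shell
#!/bin/sh
# Report Spectre/Meltdown mitigation status from sysfs.
# The vulnerabilities directory only exists on kernels that carry
# the reporting patches (mainline 4.15+, or distro backports).
VULN_DIR=/sys/devices/system/cpu/vulnerabilities
if [ -d "$VULN_DIR" ]; then
    # Each file prints "Not affected", "Vulnerable" or "Mitigation: ...".
    grep -H . "$VULN_DIR"/*
else
    echo "kernel does not report mitigation status; assume unpatched"
fi
```

Absence of the directory doesn't prove you're vulnerable, only that your kernel predates the reporting interface, which is why Greg's fuller recipe also checks the kernel version and changelog.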

Speaking of Spectre patches, Richard Biener made release candidate GCC 7.3 available today, and if all goes well, he plans for it to be an official release on Thursday, January 25, 2018.

Recently we shared news that the 4.16 kernel will start seeing some VirtualBox guest additions code, but that isn't the only thing being added to the kernel. A recent pull request is making its way into RC1 and will be adding a lot of updates to the existing sound drivers alongside new hardware support.

Late last week Canonical announced Slack as a snap, making it available across Linux platforms. It’s in beta, but you can download it here.

And last but certainly not least, we’d like to send out a big Happy 20th Birthday to LWN!

Thanks to Petros Koutoupis for his contributions to this article.

Original Link

New Kernel Releases, Net Neutrality, Thunderbird Survey and More

News roundup for January 17, 2018.

Hot off the presses and just released: the 4.14.14 [stable], 4.9.77 [longterm], 4.4.112 [longterm] and 3.18.92 [longterm] kernels. More information is available from The Linux Kernel Archives.
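If you're wondering which of these series your machine is on (and so which update applies to you), the running kernel release tells you:

```shell
# Print the running kernel release (e.g. "4.14.13"); the first two
# version fields identify the series (4.14, 4.9, 4.4 or 3.18) to
# compare against the stable/longterm releases listed above.
uname -r
```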

In an effort to protect Net Neutrality (and the internet), Mozilla filed a petition in federal court yesterday against the FCC. The idea behind Net Neutrality is to treat all internet traffic equally and without discrimination against content or type.

Make your opinions heard: Monterail and the Thunderbird email client development team are asking for your assistance to help improve the user interface in the redesign of the Thunderbird application. Be sure to take the survey.

In a recent DebConf presentation, Google announced that it will be replacing its internally used Goobuntu development platform (the latest of which is based on Ubuntu 14.04 LTS) with something called gLinux (using Debian 10 Buster). You can watch the presentation here (approximately 12 minutes in).

In the very near future, some VirtualBox users may not need to install the application's guest additions to enable more seamless and shared functionality. Currently making its way through the Linux kernel mailing list is a series of patches to enable folder-sharing support out of the box, which may appear as early as 4.16.

Original Link

Linux kernel mailing list back online; Meltdown and Spectre vulnerabilities; Mobile OS eelo; Barcelona now using Linux

News digest for January 15, 2018.

Just released on January 14, 2018: the 4.15-rc8 Linux kernel. You can view the commit diff here, and more information is available from The Linux Kernel Archives.

The popular Linux Kernel Mailing List website is back online after being down for several days due to a power outage at the home server where it was hosted. Upon reboot, a password (for dm-crypt) was required to mount the root device; however, that in itself was not the problem. The problem was that the PC's owner, Jasper, was on vacation when all of this occurred. Anyway, the site is now back up and continuing to operate as it always has.

Speaking of the kernel mailing lists, Johannes Weiner issued a call for proposals for agenda topics for the upcoming annual 2018 Linux Storage, Filesystem and Memory Management (LSF/MM) Summit. The deadline is January 31, 2018, and the summit will be held April 23-25 at Deer Valley Lodges in Park City, Utah. For more information, visit the Linux Foundation Events page.

Major Linux distributions are continuing to work hard pushing updates and patches for the Meltdown and Spectre vulnerabilities. If you are unfamiliar with these two critical issues, read up on them at this link, and be sure to update your distribution ASAP. Officially, the fixes will be back-ported to the 4.14, 4.9 and 4.4 long-term supported Linux kernels, although some of the major distributions will continue to back-port the fixes to some of the much older kernels used in their respective long-term supported operating system releases. You can follow the various distributions' feeds to get the latest status.

With its goal already reached, the Kickstarter campaign to fund the open-source mobile operating system eelo has four days left before concluding. Eelo is a reskinned and, in some cases, re-engineered implementation of Android, with an emphasis on user privacy.

Late last week, the city of Barcelona announced that it will be replacing its existing Microsoft systems and proprietary software with Linux and open-source software, and the migration is already underway. The city's primary aims are to avoid spending large sums on licensing costs and services and to become less dependent on single vendors and suppliers.

Original Link