November 23, 2020


Shirish Agarwal

White Hat Senior and Education

I had been thinking of doing a blog post on RCEP, which China signed with 14 countries a week and a day back, but this new story has broken and is going a bit viral on the interwebs, especially Twitter, and it is pretty much in our domain, so I thought it would be better to do a blog post about it. Also, there is quite a lot packed in, so there is quite a bit of unpacking to do.

Whitehat, Greyhat and Blackhat

For those of you who may not know, there are actually three terms, especially in computer science, that one comes across: white hats, grey hats and black hats. Clinically speaking, white hats are like the fiery angels or the good guys who basically take permission to try and find weaknesses in an application, program, website, organization and so on and so forth. Somewhat dated references to hackers would be Sandra Bullock in The Net (1995), Sneakers (1992) and Live Free or Die Hard (2007). Of the three one could argue that Sandra was actually into viruses, which are part of computer security, but still she showed some bad-ass skills, but then that is what actors are paid to do 🙂 Sneakers was much more interesting for me because in that you got the best key which can unlock any lock, something like what quantum computing is supposed to do. One could place the protagonists of the first two movies in either the white hat or grey hat camp. A grey hat is more flexible in his/her moral values, and there are plenty of such people. For example, Julian Assange could be described as a grey hat, but as you can see and understand those are moral issues.

A black hat, on the other hand, is one who does things for profit even if it harms others. The easiest fictitious examples are the Die Hard movies: all of them except the 4th one had bad guys or black hats. The 4th one is the odd one out as it had Matthew Farrell (Justin Long) as a grey hat hacker. In real life, Kevin Mitnick, Kevin Poulsen, Robert Tappan Morris, George Hotz and Gary McKinnon are some examples of hackers, most of whom started out as black hats and later reformed into white hats and security specialists. There are many other groups and names but that perhaps is best for another day altogether.

Now why am I sharing this? Because in all of the above, the people using and working with the systems have a better-than-average understanding of them and arguably would be better than most people at securing their networks, systems etc., but as we shall see, in this case there have been lots of issues in the company.

WhiteHat Jr. and 300 Million Dollars

Before I start, I would like to share that for me this suit in many ways seems similar to the suit filed against Krishnaraj Rao. The difference is that Krishnaraj Rao’s case was about real estate while this one is about ‘education’; many things are similar to that case but also differ in some obvious ways. For example, in the suit against Krishnaraj Rao, the plaintiffs first approached the High Court and then the Supreme Court. Of course Krishnaraj Rao won in the High Court, and then in the SC the plaintiffs agreed to Krishnaraj Rao’s demands as they knew they could not win in the SC. In that case, a compromise was reached by the plaintiffs just before judgement was to be delivered.

In this case, the plaintiffs have come directly to the SC, short-circuiting the high court process. This seems to be a new trend with the current Government in power, where the rich get to be in the SC without having to go to the Honorable HC. It says much about the SC as well, as they entertained the application and didn’t ask the plaintiffs to go to the lower court first, as should have been the case, but that is and was the honorable SC’s right to decide. The charges against Pradeep Poonia (the defendant in this case) are very similar to those which were made in Krishnaraj Rao’s suit, hence I won’t be going into those details. They have claimed defamation and filed a 20 crore suit. The idea is basically to silence any whistle-blowers.

Fictional Character Wolf Gupta

The first issue in this case, or perhaps the most famous or infamous character, is an unknown. He has reportedly been hired by Google India, BYJU'S, Chandigarh; this has been reported by Yahoo News. I did a cursory search on LinkedIn to see if there indeed is a Wolf Gupta but wasn’t able to find any person with such a name. I am not even talking about the amount of money/salary the fictitious gentleman is supposed to have got and the various variations on the salary figures at different times and in different ads.

If I wanted to, I could have asked a few of the kind souls whom I know are working at Google to see if they can find such a person using their own credentials, but it probably would have been a waste of time. When you show a LinkedIn profile in your social media, it should come up in the results; in this case it doesn’t. I also tried to find out if somehow BYJU'S was a partner of Google and came up empty there as well. There is another story done by Kan India, but as I’m not a subscriber, I don’t know what they have written, though the beginning of the story itself does not bode well.

While I can understand marketing, there is a line between marketing something and being misleading. At least to me, all of the references shared seem misleading.

Taking down dissent

One of the big no-nos, at least as I perceive it, is that you cannot and should not take down dissent or critique. Indians, like most people elsewhere around the world, critique and criticize day and night. Social media like Twitter, Mastodon and many others would not exist in the first place if criticism were not there. In fact, one could argue that Twitter and most social media are used to drive engagement with a person, brand etc. It is even an official policy at Twitter. Now you can’t drive engagement without also being open to critique, and this is true of all the web, including WordPress and me 🙂 . What has been happening is that WhiteHat Jr., with the help of BYJU'S, has been taking down people’s content citing copyright violation, which seems laughable.

When citizens critique anything, we are obviously going to use the name of the product, otherwise people would have to start using new names, similar to how Tom Riddle was known as ‘Dark Lord’, ‘Voldemort’ and ‘He who shall not be named’. There have been quite a few takedowns; I just provide one for reference, the rest of the takedowns will probably come up in the ongoing suit/case.

Whitehat Jr. ad showing investors fighting


Now a brief synopsis of what the ad is about. The ad is about a kid named ‘Chintu’ who makes an app. The app is so good that investors come to his house, and right there in the lawn they start fighting each other. The parents enjoy watching the fight, and to add to the whole thing there is also a nosy neighbor who has his own observations. Simply speaking, it is a juvenile ad, but it works because most parents in India, as elsewhere, are insecure.

Jihan critiquing the whitehatjr ad

Before starting, let me assure you that I asked Jihan’s parents if it’s OK to share his video on my blog and they agreed. What he has done is break down the ad and show how juvenile it is, using logic and humor as a template. He does make sure to state that his critique is about the ad and not the product, as he hasn’t used the product and doesn’t know how good it is.

The Website

If you look at the website, sadly, most of the site only talks about itself rather than giving examples that people can look at in detail. For example, they say they have a few apps on the Google Play Store but provide no link to confirm the same. The same is true of quite a few other things. In another ad a Paralympic star says don’t get into sports, get into coding. Which athlete in their right mind would say that? And it isn’t that we (India) are brimming with athletes at the international level. In the last outing, in 2016, India sent a stunning 117 athletes, but that was an exception as we had the women’s hockey squad of 16 women, and even then the athletes were outnumbered by the bureaucratic and support staff. There was criticism about the staff bit, but that is probably a story for another date.

Most of the site doesn’t really give much value and the point seems to be driving sales to their courses. This puts pressure on small kids as well as on teenagers in the second and third year of science and engineering courses, whose parents don’t get that the advertising is fake and instead think that their kids are incompetent. The teenagers more often than not are unable to tell or share with them that this is advertising and it is fake. Also, most of us have been raised on a good diet of ads: Fair & Lovely still sells even though we know it doesn’t work.

This does remind me of a similar fake academy which used very similar tactics and which nobody remembers today. There used to be an academy called Wings Academy or some similar name. They used to advertise that you come to us and we will make you into a pilot or an air hostess, and it was only much later that it was found out that most kids ended up doing laundry work in hotels and other such jobs. Many had taken loans, went bankrupt and even committed suicide because they were unable to pay off the loans, given the dreams sold by the company and the harsh realities that awaited them. They were sued in court, but I don’t know what happened; soon they were off the radar, so we never came to know what happened to those millions of kids whose life dreams were shattered.

Security

Now comes the security part. They have alleged that Pradeep Poonia broke into their systems. While this may be true, what I find funny is that, with the name WhiteHat, how can they justify it? If you are saying you are a white hat, you are supposed to be much better than this. And while I have not tried to penetrate their systems, I did find it laughable that the site is using an expired https:// certificate. I could have tried further to figure out their systems but I chose not to. How they could not have an automated script to renew the certificate is beyond me. But that is their concern, not mine.

Comparison

A similar offering would be Unacademy, but as can be seen they neither try to push you in any way nor make any ridiculous claims. In fact, how genuine Unacademy is can be gauged from the fact that many of its learning resources are available for people to see on YouTube, and if they have the tools they can also download them. Now, does this mean that every educational website should have its content for free? Of course not. But when 80%–90% of a channel’s YouTube content is ads and testimonials, that should surely give both parents and students a reason to pause. But if parents had done that much research, then things would not be where they are now.

Allegations

Just to complete the picture, there are allegations by Pradeep Poonia, with some screenshots, which show the company has been doing a lot of bad things. For example, they were harassing an employee at 2 a.m. who was frustrated and working at the company at the time. Many of the company staff routinely made sexist, offensive and sexually abusive remarks privately between themselves about prospective women employees who came to interview via webcam (due to the pandemic). There also seems to be a bit of porn on the web/mobile server of the company. There have also been allegations that while the company says refunds are done the next day, many parents who have demanded those refunds have not got them. Now while Pradeep has shared some of the quotations of the staff while hiding the identities of both the victims and the perpetrators, the language being used in itself tells a lot. I am in two minds about whether to share those screenshots or not, hence at the moment I am choosing not to. Poonia has also contended that the teachers do not know programming and are given scripts to read from. There have been some people who did share that experience with him –

Suruchi Sethi

From the company’s side, they are alleging he has hacked the company’s servers and would probably be using the fruit of the poisonous tree argument, which we have seen used in many cases.

Conclusion

Now it lies in the hands of the Court whether the single bench chooses the literal meaning or the spirit of the law, or the genuine concerns of the people involved. In today’s hearing the company asked for a complete, sweeping injunction but was unable to get it. Whatever happens, we may hope to see some fireworks in the second hearing, which is slated for 6.01.2021, where all of this plays out. Till later.

23 November, 2020 07:22PM by shirishag75

Vincent Fourmond

QSoas tips and tricks: using meta-data, first level

By essence, QSoas works with \(y = f(x)\) datasets. However, in practice, when working with experimental data (or data generated from simulations), one often has more than one experimental parameter in addition to \(x\). For instance, one could record series of spectra (\(A = f(\lambda)\)) for different pH values, so that the absorbance is in fact a function of both the pH and \(\lambda\). QSoas has different ways to deal with such situations, and we'll describe one of them today, using meta-data.

Setting meta-data

Meta-data are simply a series of name/value pairs attached to a dataset. The values can be numbers, dates or just text. Some of them are automatically detected from certain types of data files (but that is the topic for another day). The simplest way to set meta-data is to use the set-meta command:
QSoas> set-meta pH 7.5
This command sets the meta-data pH to the value 7.5. Keep in mind that QSoas does not know anything about the meaning of the meta-data[1]. It can keep track of the meta-data you give, and manipulate them, but it will not interpret them for you. You can set several meta-data by repeating calls to set-meta, and you can display the meta-data attached to a dataset using the command show. Here is an example:
QSoas> generate-buffer 0 10
QSoas> set-meta pH 7.5
QSoas> set-meta sample "My sample"
QSoas> show 0
Dataset generated.dat: 2 cols, 1000 rows, 1 segments, #0
Flags: 
Meta-data:	pH =	 7.5	sample =	 My sample
Note here the use of quotes around My sample since there is a space inside the value.

Using meta-data

There are many ways to use meta-data in QSoas. In this post, we will discuss just one: using meta-data in the output file. The output file can collect data from several commands, like peak data, statistics and so on. For instance, each time the command 1 is run, a line with the information about the largest peak of the current dataset is written to the output file. It is possible to automatically add meta-data to those lines by using the /meta= option of the output command. Just listing the names of the meta-data will add them to each line of the output file.
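
For instance, with the pH meta-data set earlier, the output could be configured like this (a minimal sketch using only the set-meta and output options shown in this post; the file name out.dat is just the default output file mentioned below):
QSoas> output out.dat /meta=pH
From then on, every line written to the output file (for instance by the command 1) also carries the pH value of the dataset it was computed from.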

As a full example, we'll see how one can take advantage of meta-data to determine how the position of the peak of the function \(x^2 \exp (-a\,x)\) depends on \(a\). For that, we first create a script that generates the function for a certain value of \(a\), sets the meta-data a to the corresponding value, and finds the peak. Let's call this file do-one.cmds (all the script files can be found in the GitHub repository):
generate-buffer 0 20 x**2*exp(-x*${1})
set-meta a ${1}
1 
This script takes a single argument, the value of \(a\), generates the appropriate dataset, sets the meta-data a and writes the data about the largest (and, in this case, only) peak to the output file. Let's now run this script with 1 as an argument:
QSoas> @ do-one.cmds 1
This command generates a file out.dat containing the following data:
## buffer       what    x       y       index   width   left_width      right_width     area
generated.dat   max     2.002002002     0.541340590883  100     3.4034034034    1.24124124124   2.16216216216   1.99999908761
This gives various information about the peak found: the name of the dataset it was found in, whether it's a maximum or minimum, the x and y positions of the peak, the index in the file, the widths of the peak and its area. We are interested here mainly in the x position.

Then, we just run this script for several values of \(a\) using run-for-each, and in particular the option /range-type=lin that makes it interpret values like 0.5..5:80 as 80 values evenly spread between 0.5 and 5. The script is called run-all.cmds:
output peaks.dat /overwrite=true /meta=a
run-for-each do-one.cmds /range-type=lin 0.5..5:80
V all /style=red-to-blue
The first line sets up the output to go to the output file peaks.dat. The option /meta=a makes sure the meta a is added to each line of the output file, and /overwrite=true makes sure the file is overwritten just before the first data is written to it, in order to avoid accumulating the results of different runs of the script. The last line just displays all the curves with a color gradient.
Running this script (with @ run-all.cmds) creates a new file peaks.dat, whose first line looks like this:
## buffer       what    x       y       index   width   left_width      right_width     area    a
The column x (the 3rd) contains the position of the peaks, and the column a (the 10th) contains the meta a (this column wasn't present in the output we described above, because we had not yet used the output /meta=a command). Therefore, to load the peak position as a function of a, one just has to run:
QSoas> load peaks.dat /columns=10,3
Et voilà !

To train further, you can:
  • improve the resolution in x;
  • improve the resolution in y;
  • plot the magnitude of the peak;
  • extend the range;
  • derive the analytical formula for the position of the peak and verify it !

[1] this is not exactly true. For instance, some commands like unwrap interpret the sr meta-data as a voltammetric scan rate if it is present. But this is the exception.

About QSoas

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code there (or clone from the GitHub repository) and compile it yourself, or buy precompiled versions for MacOS and Windows there.

23 November, 2020 06:55PM by Vincent Fourmond (noreply@blogger.com)

November 22, 2020

François Marier

Removing a corrupted data pack in a Restic backup

I recently ran into a corrupted data pack in a Restic backup on my GnuBee. It led to consistent failures during the prune operation:

incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
hash does not match id: want 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5, got 2818331716e8a5dd64a610d1a4f85c970fd8ae92f891d64625beaaa6072e1b84
github.com/restic/restic/internal/repository.Repack
        github.com/restic/restic/internal/repository/repack.go:37
main.pruneRepository
        github.com/restic/restic/cmd/restic/cmd_prune.go:242
main.runPrune
        github.com/restic/restic/cmd/restic/cmd_prune.go:62
main.glob..func19
        github.com/restic/restic/cmd/restic/cmd_prune.go:27
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra/command.go:838
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra/command.go:943
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra/command.go:883
main.main
        github.com/restic/restic/cmd/restic/main.go:86
runtime.main
        runtime/proc.go:204
runtime.goexit
        runtime/asm_amd64.s:1374

Thanks to the excellent support forum, I was able to resolve this issue by dropping a single snapshot.

First, I identified the snapshot which contained the offending pack:

$ restic -r sftp:hostname.local: find --pack 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5
repository b0b0516c opened successfully, password is correct
Found blob 2beffa460d4e8ca4ee6bf56df279d1a858824f5cf6edc41a394499510aa5af9e
 ... in file /home/francois/.local/share/akregator/Archive/http___udd.debian.org_dmd_feed_
     (tree 602b373abedca01f0b007fea17aa5ad2c8f4d11f1786dd06574068bf41e32020)
 ... in snapshot 5535dc9d (2020-06-30 08:34:41)

Then, I could simply drop that snapshot:

$ restic -r sftp:hostname.local: forget 5535dc9d
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  1 / 1 files deleted

and run the prune command to remove the snapshot, as well as the incomplete packs that were also mentioned in the above output but could never be removed due to the other error:

$ restic -r sftp:hostname.local: prune
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[20:11] 100.00%  77439 / 77439 packs
incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
repository contains 77434 packs (2384522 blobs) with 367.648 GiB
processed 2384522 blobs: 1165510 duplicate blobs, 47.331 GiB duplicate
load all snapshots
find data that is still in use for 15 snapshots
[1:11] 100.00%  15 / 15 snapshots
found 1006062 of 2384522 data blobs still in use, removing 1378460 blobs
will remove 5 invalid files
will delete 13728 packs and rewrite 15140 packs, this frees 142.285 GiB
[4:58:20] 100.00%  15140 / 15140 packs rewritten
counting files in repo
[18:58] 100.00%  50164 / 50164 packs
finding old index files
saved new indexes as [340cb68f 91ff77ef ee21a086 3e5fa853 084b5d4b 3b8d5b7a d5c385b4 5eff0be3 2cebb212 5e0d9244 29a36849 8251dcee 85db6fa2 29ed23f6 fb306aba 6ee289eb 0a74829d]
remove 190 old index files
[0:00] 100.00%  190 / 190 files deleted
remove 28868 old packs
[1:23] 100.00%  28868 / 28868 files deleted
done

22 November, 2020 07:28PM

Molly de Blanc

Why should you work on free software (or other technology issues)?

Twice this week I was asked how it can be okay to work on free software when there are issues like climate change and racial injustice. I have a few answers for that.

You can work on injustice while working on free software.

A world in which all technology is just cannot exist under capitalism. It cannot exist under racism or sexism or ableism. It cannot exist in a world that does not exist if we are ravaged by the effects of climate change. At the same time, free software is part of the story of each of these. The modern technology state fuels capitalism, and capitalism fuels it. It cannot exist without transparency at all levels of the creation process. Proprietary software and algorithms reinforce racial and gender injustice. Technology is very guilty of its contributions to the climate crisis. By working on making technology more just, by making it more free, we are working to address these issues. Software makes the world work, and oppressive software creates an oppressive world.

You can work on free software while working on injustice.

Let’s say you do want to devote your time to working on climate justice full time. Activism doesn’t have to only happen in the streets or in legislative buildings. Being a body in a protest is activism, and so is running servers for your community’s federated social network, providing wiki support, developing custom software, and otherwise bringing your free software skills into new environments. As long as your work is being accomplished under an ethos of free software, with free software, and under free software licenses, you’re working on free software issues while saving the world in other ways too!

Not everyone needs to work on everything all the time.

When your house is on fire, you need to put out the fire. However, maybe you can’t help put out the fire. Maybe you don’t have the skills or knowledge or physical ability. Maybe your house is on fire, but there’s also an earthquake and a meteor and an airborne toxic event all coming at once. When that happens, we have to split up our efforts, and that’s okay.

22 November, 2020 05:41PM by mollydb

Arturo Borrero González

How to use nftables from python


One of the most interesting (and possibly unknown) features of the nftables framework is the native python interface, which allows python programs to access all nft features programmatically, from the source code.

There is a high-level library, libnftables, which is responsible for translating the human-readable syntax from the nft binary into low-level expressions that the nf_tables kernel subsystem can run. The nft command line utility basically wraps this library, which is where all the actual nftables logic lives. You can only imagine how powerful this library is. It is written in C, and ctypes is used to natively wrap the shared library object from pure python.

To use nftables in your python script or program, first you have to install the libnftables library and the python bindings. In Debian systems, installing the python3-nftables package should be enough to have everything ready to go.

To interact with libnftables you have two options: either use the standard nft syntax or the JSON format. The standard format allows you to send commands exactly like you would using the nft binary. That format is intended for humans and doesn’t make a lot of sense in a programmatic interaction. JSON, on the other hand, is pretty convenient, especially in a python environment, where there are direct data structure equivalents.

The following code snippet gives you an example of how easy this is to use:

#!/usr/bin/env python3

import nftables
import json

nft = nftables.Nftables()
nft.set_json_output(True)
rc, output, error = nft.cmd("list ruleset")
print(json.loads(output))

This is functionally equivalent to running nft -j list ruleset. Basically, all you have to do in your python code is:

  • import the nftables & json modules
  • init the libnftables instance
  • configure library behavior
  • run commands and parse the output (ideally using JSON)

The key here is to use the JSON format. It allows adding ruleset modifications in batches, i.e. creating tables, chains, rules, sets, stateful counters, etc. in a single atomic transaction, which is the proper way to update firewalling and NAT policies in the kernel and to avoid inconsistent intermediate states.

The JSON schema is pretty well documented in the libnftables-json(5) manpage. The following example is copy/pasted from there, and illustrates the basic idea behind the JSON format. The structure accepts an arbitrary number of commands which are interpreted in order of appearance. For instance, the following standard syntax input:

flush ruleset
add table inet mytable
add chain inet mytable mychain
add rule inet mytable mychain tcp dport 22 accept

Translates into JSON as such:

{ "nftables": [
    { "flush": { "ruleset": null }},
    { "add": { "table": {
        "family": "inet",
        "name": "mytable"
    }}},
    { "add": { "chain": {
        "family": "inet",
        "table": "mytable",
        "chain": "mychain"
    }}},
    { "add": { "rule": {
        "family": "inet",
        "table": "mytable",
        "chain": "mychain",
        "expr": [
            { "match": {
                "left": { "payload": {
                    "protocol": "tcp",
                    "field": "dport"
                }},
                "right": 22
            }},
            { "accept": null }
        ]
    }}}
]}
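
To run such a batch from python, the bindings also provide a json_cmd() method that takes the already-parsed data structure instead of a command string. Here is a minimal sketch of applying the ruleset above in one atomic transaction (assuming the json_cmd() method is present in your python3-nftables version; it needs root privileges to actually modify the kernel ruleset):

#!/usr/bin/env python3

import nftables

# The same batch as the JSON document above, as a python data structure.
batch = {"nftables": [
    {"flush": {"ruleset": None}},
    {"add": {"table": {"family": "inet", "name": "mytable"}}},
    {"add": {"chain": {"family": "inet",
                       "table": "mytable",
                       "chain": "mychain"}}},
    {"add": {"rule": {"family": "inet",
                      "table": "mytable",
                      "chain": "mychain",
                      "expr": [
                          {"match": {
                              "left": {"payload": {"protocol": "tcp",
                                                   "field": "dport"}},
                              "right": 22}},
                          {"accept": None}]}}},
]}

nft = nftables.Nftables()
rc, output, error = nft.json_cmd(batch)
if rc != 0:
    raise RuntimeError("nftables batch failed: %s" % error)

All the commands in the batch either succeed or fail together, which is exactly the atomic behavior described above.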

I encourage you to take a look at the manpage if you want to know about how powerful this interface is. I’ve created a git repository to host several source code examples using different features of the library: https://github.com/aborrero/python-nftables-tutorial. I plan to introduce more code examples as I learn and create them.

There are several relevant projects out there using this nftables python integration already. One of the most important pieces of software is firewalld. They started using the JSON format back in 2019.

In the past, people interacting with iptables programmatically would either call the iptables binary directly or, in the case of some C programs, hack the libiptc/libxtables libraries into their source code. The native python approach to using libnftables is a huge step forward, which should come in handy for developers, network engineers, integrators and other folks using the nftables framework in a pythonic environment.

If you are interested to know how this python binding works, I invite you to take a look at the upstream source code, nftables.py, which contains all the magic behind the scenes.

22 November, 2020 05:08PM


Markus Koschany

My Free Software Activities in October 2020

Welcome to gambaru.de. Here is my monthly report (+ the first week in November) that covers what I have been doing for Debian. If you’re interested in Java, Games and LTS topics, this might be interesting for you.

Debian Games

  • I released a new version of debian-games, a collection of metapackages for games. As expected, the Python 2 removal takes its toll on games in Debian that depend on pygame or other Python 2 libraries. Currently we have lost more games in 2020 than could be newly introduced to the archive. All in all it could be better, but also a lot worse.
  • New upstream releases were packaged for freeorion and xaos.
  • Most of the time was spent on upgrading the bullet physics library to version 3.06, testing all reverse-dependencies and requesting a transition for it. (#972395) Similar to bullet, I also updated box2d, the 2D counterpart. The only reverse-dependency, caveexpress, fails to build from source with box2d 2.4.1, so unless I can fix it, it doesn’t make much sense to upload the package to unstable.
  • Some package polishing: I could fix two bugs, one in stormbaancoureur (patch by Helmut Grohne) and one in ardentryst, which required a dependency on python3-future to start.
  • I sponsored mgba and pekka-kana-2 for Ryan Tandy and Carlos Donizete Froes
  • and started to work on porting childsplay to Python 3.
  • Finally I did a NMU for bygfoot to work around a GCC 10 FTBFS.

Debian Java

  • I uploaded pdfsam and its related sejda libraries to unstable and applied an upstream patch to fix an error with Debian’s jackson-jr version. Everything should be usable and up-to-date now.
  • I updated mina2 and investigated a related build failure in apache-directory-server, packaged a new upstream release of commons-io and undertow and fixed a security vulnerability in junit4 by upgrading to version 4.13.1.
  • The upgrade of jflex to version 1.8.2 took a while. The package is available in experimental now, but regression tests with ratt showed that several reverse-dependencies FTBFS with 1.8.2. Since all of these projects work fine with 1.7.0, I intend to postpone the upload to unstable. No need to break something.

Misc

  • This month also saw new upstream versions of wabt and binaryen.
  • I intend to update ublock-origin in Buster but I haven’t heard back from the release team yet. (#973695)

Debian LTS

This was my 56th month as a paid contributor and I have been paid to work 20.75 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following:

  • DLA-2440-1. Issued a security update for poppler fixing 9 CVE.
  • DLA-2445-1. Issued a security update for libmaxminddb fixing 1 CVE.
  • DLA-2447-1. Issued a security update for pacemaker fixing 1 CVE. The update had to be reverted because of an unexpected permission problem. I am in contact with one of the users who reported the regression and my intention is to update pacemaker to the latest supported release in the 1.x branch. If further tests show no more regressions, a new update will follow shortly.
  • Investigated CVE-2020-24614 in fossil and marked the issue as no-dsa because the impact for Debian users was low.
  • Investigated the open security vulnerabilities in ansible (11) and prepared some preliminary patches. The work is ongoing.
  • Fixed the remaining zsh vulnerabilities in Stretch in line with Debian 8 „Jessie“, so that all versions in Debian are equally protected.

ELTS

Extended Long Term Support (ELTS) is a project led by Freexian to further extend the lifetime of Debian releases. It is not an official Debian project but all Debian users benefit from it without cost. The current ELTS release is Debian 8 „Jessie“. This was my 29th month and I have been paid to work 15 hours on ELTS.

  • ELA-302-1. Issued a security update for poppler fixing 2 CVE. Investigated Debian bug #942391, identified the root cause and reverted the patch for CVE-2018-13988.
  • ELA-303-1. Issued a security update for junit4 fixing 1 CVE.
  • ELA-316-1. Issued a security update for zsh fixing 7 CVE.

Thanks for reading and see you next time.

22 November, 2020 03:45PM by apo

November 21, 2020

Michael Stapelberg

Debian Code Search: positional index, TurboPFor-compressed

See the Conclusion for a summary if you’re impatient :-)

Motivation

Over the last few months, I have been developing a new index format for Debian Code Search. This required a lot of careful refactoring, re-implementation, debug tool creation and debugging.

Multiple factors motivated my work on a new index format:

  1. The existing index format has a 2G size limit, into which we have bumped a few times, requiring manual intervention to keep the system running.

  2. Debugging the existing system required creating ad-hoc debugging tools, which made debugging sessions unnecessarily lengthy and painful.

  3. I wanted to check whether switching to a different integer compression format would improve performance (it does not).

  4. I wanted to check whether storing positions with the posting lists would improve performance of identifier queries (= queries which are not using any regular expression features), which make up 78.2% of all Debian Code Search queries (it does).

I figured building a new index from scratch was the easiest approach, compared to refactoring the existing index to increase the size limit (point ①).

I also figured it would be a good idea to develop the debugging tool in lock step with the index format so that I can be sure the tool works and is useful (point ②).

Integer compression: TurboPFor

As a quick refresher, search engines typically store document IDs (representing source code files, in our case) in an ordered list (“posting list”). It usually makes sense to apply at least a rudimentary level of compression: our existing system used variable integer encoding.
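
As a concrete refresher, here is a short Python sketch (my own illustration, not the actual Debian Code Search or codesearch code) of that baseline scheme: delta-encode a sorted posting list and store each gap as a variable-length integer, so that small gaps between document IDs fit into single bytes.

def varint_encode(n):
    # Encode one unsigned integer, 7 bits per byte, high bit = continuation.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_posting_list(docids):
    # Delta-encode a sorted posting list: store gaps, not absolute IDs.
    out = bytearray()
    prev = 0
    for docid in docids:
        out += varint_encode(docid - prev)
        prev = docid
    return bytes(out)

print(encode_posting_list([3, 7, 260, 261]).hex())  # prints 0304fd0101

TurboPFor achieves its smaller size by replacing this kind of per-integer, byte-oriented coding with block-wise packing; see the linked “TurboPFor: an analysis” for the details.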

TurboPFor, the self-proclaimed “Fastest Integer Compression” library, combines an advanced on-disk format with a carefully tuned SIMD implementation to reach better speeds (in micro benchmarks) at less disk usage than Russ Cox’s varint implementation in github.com/google/codesearch.

If you are curious about its inner workings, check out my “TurboPFor: an analysis”.

Applied on the Debian Code Search index, TurboPFor indeed compresses integers better:

Disk space

  • 8.9G codesearch varint index
  • 5.5G TurboPFor index

Switching to TurboPFor (via cgo) for storing and reading the index results in a slight speed-up of a dcs replay benchmark, which is more pronounced the more i/o is required.

Query speed (regexp, cold page cache)

  • 18s codesearch varint index
  • 14s TurboPFor index (cgo)

Query speed (regexp, warm page cache)

  • 15s codesearch varint index
  • 14s TurboPFor index (cgo)

Overall, TurboPFor is an all-around improvement in efficiency, albeit with a high cost in implementation complexity.

Positional index: trade more disk for faster queries

This section builds on the previous section: all figures come from the TurboPFor index, which can optionally support positions.

Conceptually, we’re going from:

type docid uint32
type index map[trigram][]docid

…to:

type occurrence struct {
    doc docid
    pos uint32 // byte offset in doc
}
type index map[trigram][]occurrence

The resulting index consumes more disk space, but can be queried faster:

  1. We can do fewer queries: instead of reading all the posting lists for all the trigrams, we can read the posting lists for the query’s first and last trigram only.
    This is one of the tricks described in the paper “AS-Index: A Structure For String Search Using n-grams and Algebraic Signatures” (PDF), and goes a long way without incurring the complexity, computational cost and additional disk usage of calculating algebraic signatures.

  2. Verifying that the delta between the last and first position matches the length of the query term significantly reduces the number of files to read (lower false positive rate); a small sketch of this check follows the list.

  3. The matching phase is quicker: instead of locating the query term in the file, we only need to compare a few bytes at a known offset for equality.

  4. More data is read sequentially (from the index), which is faster.
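
To illustrate point 2 above, here is a small Python sketch (again my own illustration, not the actual Debian Code Search implementation) of the first/last-trigram check: a document stays a candidate only if some position of the last trigram lies exactly len(query) - 3 bytes after a position of the first trigram.

def candidate_docs(query, positional_index):
    # positional_index maps a trigram to a list of (docid, byte_offset) pairs.
    first, last = query[:3], query[-3:]
    expected_delta = len(query) - 3  # offset of the last trigram inside the query

    last_positions = {}
    for docid, pos in positional_index.get(last, []):
        last_positions.setdefault(docid, set()).add(pos)

    matches = set()
    for docid, pos in positional_index.get(first, []):
        if pos + expected_delta in last_positions.get(docid, set()):
            matches.add(docid)
    return matches

# Toy index over the single document "hello_world" with docid 1.
doc = "hello_world"
index = {}
for i in range(len(doc) - 2):
    index.setdefault(doc[i:i+3], []).append((1, i))

print(candidate_docs("lo_wor", index))  # {1}

Surviving documents still go through the matching phase, but as described in point 3 it only needs to compare a few bytes at a known offset.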

Disk space

A positional index consumes significantly more disk space, but not so much as to pose a challenge: a Hetzner EX61-NVME dedicated server (≈ 64 €/month) provides 1 TB worth of fast NVMe flash storage.

  • 6.5G non-positional
  • 123G positional
  • 93G positional (posrel)

The idea behind the positional index (posrel) is to not store a (doc,pos) tuple on disk, but to store positions, accompanied by a stream of doc/pos relationship bits: 1 means this position belongs to the next document, 0 means this position belongs to the current document.

This is an easy way of saving some space without modifying the TurboPFor on-disk format: the posrel technique reduces the index size to about ¾.
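
To make the posrel idea concrete, here is a short Python sketch (my reading of the description above, not the actual on-disk format) that splits a list of (doc, pos) occurrences into a flat position stream plus a doc/pos relationship bit stream, and reassembles it:

def encode_posrel(occurrences):
    # occurrences: list of (docid, pos) tuples, grouped by increasing docid.
    # Returns (docids, positions, posrel): posrel[i] == 1 means positions[i]
    # starts the next document, 0 means it belongs to the current one.
    docids, positions, posrel = [], [], []
    current = None
    for doc, pos in occurrences:
        if doc != current:
            docids.append(doc)
            posrel.append(1)   # first position of the next document
            current = doc
        else:
            posrel.append(0)   # further position within the current document
        positions.append(pos)
    return docids, positions, posrel

def decode_posrel(docids, positions, posrel):
    # Inverse of encode_posrel: rebuild the (docid, pos) tuples.
    out, doc_index = [], -1
    for pos, bit in zip(positions, posrel):
        if bit:
            doc_index += 1
        out.append((docids[doc_index], pos))
    return out

occ = [(7, 3), (7, 19), (9, 0), (12, 5), (12, 8), (12, 40)]
assert decode_posrel(*encode_posrel(occ)) == occ

The position stream can then be compressed as before, while the posrel bits cost only one bit per position instead of a full document ID per occurrence.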

With the increase in size, the Linux page cache hit ratio will be lower for the positional index, i.e. more data will need to be fetched from disk for querying the index.

As long as the disk can deliver data as fast as you can decompress posting lists, this only translates into one disk seek’s worth of additional latency. This is the case with modern NVMe disks that deliver thousands of MB/s, e.g. the Samsung 960 Pro (used in Hetzner’s aforementioned EX61-NVME server).

The values were measured by running dcs du -h /srv/dcs/shard*/full without and with the -pos argument.

Bytes read

A positional index requires fewer queries: reading only the first and last trigram’s posting lists and positions is sufficient to achieve a lower (!) false positive rate than evaluating all trigrams’ posting lists in a non-positional index.

As a consequence, fewer files need to be read, resulting in fewer bytes required to read from disk overall.

As an additional bonus, in a positional index, more data is read sequentially (index), which is faster than random i/o, regardless of the underlying disk.

  • regexp queries: 1.2G (index) + 19.8G (files) = 21.0G total
  • identifier queries: 4.2G (index) + 10.8G (files) = 15.0G total

The values were measured by running iostat -d 25 just before running bench.zsh on an otherwise idle system.

Query speed

Even though the positional index is larger and requires more data to be read at query time (see above), thanks to the C TurboPFor library, the 2 queries on a positional index are roughly as fast as the n queries on a non-positional index (≈4s instead of ≈3s).

This is more than made up for by the combined i/o matching stage, which shrinks from ≈18.5s (7.1s i/o + 11.4s matching) to ≈1.3s.

  • regexp queries: 3.3s (index) + 7.1s (i/o) + 11.4s (matching) = 21.8s total
  • identifier queries: 3.92s (index) + ≈1.3s (combined i/o + matching) = 5.22s total

Note that identifier query i/o was sped up not just by needing to read fewer bytes, but also by only having to verify bytes at a known offset instead of needing to locate the identifier within the file.

Conclusion

The new index format is overall slightly more efficient. This disk space efficiency allows us to introduce a positional index section for the first time.

Most Debian Code Search queries are positional queries (78.2%) and will be answered much quicker by leveraging the positions.

Bottomline, it is beneficial to use a positional index on disk over a non-positional index in RAM.

21 November, 2020 09:04AM

Linux package managers are slow

I measured how long the most popular Linux distributions’ package managers take to install small and large packages (the ack(1p) source code search Perl script and qemu, respectively).

Where required, my measurements include metadata updates such as transferring an up-to-date package list. For me, requiring a metadata update is the more common case, particularly on live systems or within Docker containers.

All measurements were taken on an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz running Docker 1.13.1 on Linux 4.19, backed by a Samsung 970 Pro NVMe drive boasting many hundreds of MB/s write performance. The machine is located in Zürich and connected to the Internet with a 1 Gigabit fiber connection, so the expected top download speed is ≈115 MB/s.

See Appendix C for details on the measurement method and command outputs.

Measurements

Keep in mind that these are one-time measurements. They should be indicative of actual performance, but your experience may vary.

ack (small Perl program)

distribution   package manager   data     wall-clock time   rate
Fedora         dnf               114 MB   33s               3.4 MB/s
Debian         apt               16 MB    10s               1.6 MB/s
NixOS          Nix               15 MB    5s                3.0 MB/s
Arch Linux     pacman            6.5 MB   3s                2.1 MB/s
Alpine         apk               10 MB    1s                10.0 MB/s

qemu (large C program)

distribution   package manager   data     wall-clock time   rate
Fedora         dnf               226 MB   4m37s             1.2 MB/s
Debian         apt               224 MB   1m35s             2.3 MB/s
Arch Linux     pacman            142 MB   44s               3.2 MB/s
NixOS          Nix               180 MB   34s               5.2 MB/s
Alpine         apk               26 MB    2.4s              10.8 MB/s


(Looking for older measurements? See Appendix B (2019).)

The difference between the slowest and fastest package managers is 30x!

How can Alpine’s apk and Arch Linux’s pacman be an order of magnitude faster than the rest? They are doing a lot less than the others, and more efficiently, too.

Pain point: too much metadata

For example, Fedora transfers a lot more data than others because its main package list is 60 MB (compressed!) alone. Compare that with Alpine’s 734 KB APKINDEX.tar.gz.

Of course the extra metadata which Fedora provides helps some use cases, otherwise they hopefully would have removed it altogether. The amount of metadata seems excessive for the use case of installing a single package, which I consider the main use case of an interactive package manager.

I expect any modern Linux distribution to only transfer absolutely required data to complete my task.

Pain point: no concurrency

Because they need to sequence the execution of arbitrary package-maintainer-provided code (hooks and triggers), all tested package managers install packages sequentially (one after the other) instead of concurrently (all at the same time).

In my blog post “Can we do without hooks and triggers?”, I outline that hooks and triggers are not strictly necessary to build a working Linux distribution.

Thought experiment: further speed-ups

Strictly speaking, the only required feature of a package manager is to make available the package contents so that the package can be used: a program can be started, a kernel module can be loaded, etc.

By only implementing what’s needed for this feature, and nothing more, a package manager could likely beat apk’s performance. It could, for example:

  • skip archive extraction by mounting file system images (like AppImage or snappy)
  • use compression which is light on CPU, as networks are fast (like apk)
  • skip fsync when it is safe to do so, i.e.:
    • package installations don’t modify system state
    • atomic package installation (e.g. an append-only package store)
    • automatically clean up the package store after crashes

Current landscape

Here’s a table outlining how the various package managers listed on Wikipedia’s list of software package management systems fare:

name         scope   package file format                 hooks/triggers
AppImage     apps    image: ISO9660, SquashFS            no
snappy       apps    image: SquashFS                     yes: hooks
FlatPak      apps    archive: OSTree                     no
0install     apps    archive: tar.bz2                    no
nix, guix    distro  archive: nar.{bz2,xz}               activation script
dpkg         distro  archive: tar.{gz,xz,bz2} in ar(1)   yes
rpm          distro  archive: cpio.{bz2,lz,xz}           scriptlets
pacman       distro  archive: tar.xz                     install
slackware    distro  archive: tar.{gz,xz}                yes: doinst.sh
apk          distro  archive: tar.gz                     yes: .post-install
Entropy      distro  archive: tar.bz2                    yes
ipkg, opkg   distro  archive: tar{,.gz}                  yes

Conclusion

As per the current landscape, there is no distribution-scoped package manager which uses images and leaves out hooks and triggers, not even in smaller Linux distributions.

I think that space is really interesting, as it uses a minimal design to achieve significant real-world speed-ups.

I have explored this idea in much more detail, and am happy to talk more about it in my post “Introducing the distri research linux distribution”.

There are a couple of recent developments going into the same direction:

Appendix C: measurement details (2020)

ack


Fedora’s dnf takes almost 33 seconds to fetch and unpack 114 MB.

% docker run -t -i fedora /bin/bash
[root@62d3cae2e2f9 /]# time dnf install -y ack
Fedora 32 openh264 (From Cisco) - x86_64     1.9 kB/s | 2.5 kB     00:01
Fedora Modular 32 - x86_64                   6.8 MB/s | 4.9 MB     00:00
Fedora Modular 32 - x86_64 - Updates         5.6 MB/s | 3.7 MB     00:00
Fedora 32 - x86_64 - Updates                 9.9 MB/s |  23 MB     00:02
Fedora 32 - x86_64                            39 MB/s |  70 MB     00:01
[…]
real	0m32.898s
user	0m25.121s
sys	0m1.408s

NixOS’s Nix takes a little over 5s to fetch and unpack 15 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -iA nixpkgs.ack'
unpacking channels...
created 1 symlinks in user environment
installing 'perl5.32.0-ack-3.3.1'
these paths will be fetched (15.55 MiB download, 85.51 MiB unpacked):
  /nix/store/34l8jdg76kmwl1nbbq84r2gka0kw6rc8-perl5.32.0-ack-3.3.1-man
  /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31
  /nix/store/9fd4pjaxpjyyxvvmxy43y392l7yvcwy1-perl5.32.0-File-Next-1.18
  /nix/store/czc3c1apx55s37qx4vadqhn3fhikchxi-libunistring-0.9.10
  /nix/store/dj6n505iqrk7srn96a27jfp3i0zgwa1l-acl-2.2.53
  /nix/store/ifayp0kvijq0n4x0bv51iqrb0yzyz77g-perl-5.32.0
  /nix/store/w9wc0d31p4z93cbgxijws03j5s2c4gyf-coreutils-8.31
  /nix/store/xim9l8hym4iga6d4azam4m0k0p1nw2rm-libidn2-2.3.0
  /nix/store/y7i47qjmf10i1ngpnsavv88zjagypycd-attr-2.4.48
  /nix/store/z45mp61h51ksxz28gds5110rf3wmqpdc-perl5.32.0-ack-3.3.1
copying path '/nix/store/34l8jdg76kmwl1nbbq84r2gka0kw6rc8-perl5.32.0-ack-3.3.1-man' from 'https://cache.nixos.org'...
copying path '/nix/store/czc3c1apx55s37qx4vadqhn3fhikchxi-libunistring-0.9.10' from 'https://cache.nixos.org'...
copying path '/nix/store/9fd4pjaxpjyyxvvmxy43y392l7yvcwy1-perl5.32.0-File-Next-1.18' from 'https://cache.nixos.org'...
copying path '/nix/store/xim9l8hym4iga6d4azam4m0k0p1nw2rm-libidn2-2.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31' from 'https://cache.nixos.org'...
copying path '/nix/store/y7i47qjmf10i1ngpnsavv88zjagypycd-attr-2.4.48' from 'https://cache.nixos.org'...
copying path '/nix/store/dj6n505iqrk7srn96a27jfp3i0zgwa1l-acl-2.2.53' from 'https://cache.nixos.org'...
copying path '/nix/store/w9wc0d31p4z93cbgxijws03j5s2c4gyf-coreutils-8.31' from 'https://cache.nixos.org'...
copying path '/nix/store/ifayp0kvijq0n4x0bv51iqrb0yzyz77g-perl-5.32.0' from 'https://cache.nixos.org'...
copying path '/nix/store/z45mp61h51ksxz28gds5110rf3wmqpdc-perl5.32.0-ack-3.3.1' from 'https://cache.nixos.org'...
building '/nix/store/m0rl62grplq7w7k3zqhlcz2hs99y332l-user-environment.drv'...
created 49 symlinks in user environment
real	0m 5.60s
user	0m 3.21s
sys	0m 1.66s

Debian’s apt takes almost 10 seconds to fetch and unpack 16 MB.

% docker run -t -i debian:sid
root@1996bb94a2d1:/# time (apt update && apt install -y ack-grep)
Get:1 http://deb.debian.org/debian sid InRelease [146 kB]
Get:2 http://deb.debian.org/debian sid/main amd64 Packages [8400 kB]
Fetched 8546 kB in 1s (8088 kB/s)
[…]
The following NEW packages will be installed:
  ack libfile-next-perl libgdbm-compat4 libgdbm6 libperl5.30 netbase perl perl-modules-5.30
0 upgraded, 8 newly installed, 0 to remove and 23 not upgraded.
Need to get 7341 kB of archives.
After this operation, 46.7 MB of additional disk space will be used.
[…]
real	0m9.544s
user	0m2.839s
sys	0m0.775s

Arch Linux’s pacman takes a little under 3s to fetch and unpack 6.5 MB.

% docker run -t -i archlinux/base
[root@9f6672688a64 /]# time (pacman -Sy && pacman -S --noconfirm ack)
:: Synchronizing package databases...
 core            130.8 KiB  1090 KiB/s 00:00
 extra          1655.8 KiB  3.48 MiB/s 00:00
 community         5.2 MiB  6.11 MiB/s 00:01
resolving dependencies...
looking for conflicting packages...

Packages (2) perl-file-next-1.18-2  ack-3.4.0-1

Total Download Size:   0.07 MiB
Total Installed Size:  0.19 MiB
[…]
real	0m2.936s
user	0m0.375s
sys	0m0.160s

Alpine’s apk takes a little over 1 second to fetch and unpack 10 MB.

% docker run -t -i alpine
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/4) Installing libbz2 (1.0.8-r1)
(2/4) Installing perl (5.30.3-r0)
(3/4) Installing perl-file-next (1.18-r0)
(4/4) Installing ack (3.3.1-r0)
Executing busybox-1.31.1-r16.trigger
OK: 43 MiB in 18 packages
real	0m 1.24s
user	0m 0.40s
sys	0m 0.15s

qemu


Fedora’s dnf takes over 4 minutes to fetch and unpack 226 MB.

% docker run -t -i fedora /bin/bash
[root@6a52ecfc3afa /]# time dnf install -y qemu
Fedora 32 openh264 (From Cisco) - x86_64     3.1 kB/s | 2.5 kB     00:00
Fedora Modular 32 - x86_64                   6.3 MB/s | 4.9 MB     00:00
Fedora Modular 32 - x86_64 - Updates         6.0 MB/s | 3.7 MB     00:00
Fedora 32 - x86_64 - Updates                 334 kB/s |  23 MB     01:10
Fedora 32 - x86_64                            33 MB/s |  70 MB     00:02
[…]

Total download size: 181 M
Downloading Packages:
[…]

real	4m37.652s
user	0m38.239s
sys	0m6.321s

NixOS’s Nix takes almost 34s to fetch and unpack 180 MB.

% docker run -t -i nixos/nix
83971cf79f7e:/# time sh -c 'nix-channel --update && nix-env -iA nixpkgs.qemu'
unpacking channels...
created 1 symlinks in user environment
installing 'qemu-5.1.0'
these paths will be fetched (180.70 MiB download, 1146.92 MiB unpacked):
[…]
real	0m 33.64s
user	0m 16.96s
sys	0m 3.05s

Debian’s apt takes over 95 seconds to fetch and unpack 224 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y qemu-system-x86)
Get:1 http://deb.debian.org/debian sid InRelease [146 kB]
Get:2 http://deb.debian.org/debian sid/main amd64 Packages [8400 kB]
Fetched 8546 kB in 1s (5998 kB/s)
[…]
Fetched 216 MB in 43s (5006 kB/s)
[…]
real	1m25.375s
user	0m29.163s
sys	0m12.835s

Arch Linux’s pacman takes almost 44s to fetch and unpack 142 MB.

% docker run -t -i archlinux/base
[root@58c78bda08e8 /]# time (pacman -Sy && pacman -S --noconfirm qemu)
:: Synchronizing package databases...
 core          130.8 KiB  1055 KiB/s 00:00
 extra        1655.8 KiB  3.70 MiB/s 00:00
 community       5.2 MiB  7.89 MiB/s 00:01
[…]
Total Download Size:   135.46 MiB
Total Installed Size:  661.05 MiB
[…]
real	0m43.901s
user	0m4.980s
sys	0m2.615s

Alpine’s apk takes only about 2.4 seconds to fetch and unpack 26 MB.

% docker run -t -i alpine
/ # time apk add qemu-system-x86_64
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
[…]
OK: 78 MiB in 95 packages
real	0m 2.43s
user	0m 0.46s
sys	0m 0.09s

Appendix B: measurement details (2019)

ack


Fedora’s dnf takes almost 30 seconds to fetch and unpack 107 MB.

% docker run -t -i fedora /bin/bash
[root@722e6df10258 /]# time dnf install -y ack
Fedora Modular 30 - x86_64            4.4 MB/s | 2.7 MB     00:00
Fedora Modular 30 - x86_64 - Updates  3.7 MB/s | 2.4 MB     00:00
Fedora 30 - x86_64 - Updates           17 MB/s |  19 MB     00:01
Fedora 30 - x86_64                     31 MB/s |  70 MB     00:02
[…]
Install  44 Packages

Total download size: 13 M
Installed size: 42 M
[…]
real	0m29.498s
user	0m22.954s
sys	0m1.085s

NixOS’s Nix takes 14s to fetch and unpack 15 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -i perl5.28.2-ack-2.28'
unpacking channels...
created 2 symlinks in user environment
installing 'perl5.28.2-ack-2.28'
these paths will be fetched (14.91 MiB download, 80.83 MiB unpacked):
  /nix/store/57iv2vch31v8plcjrk97lcw1zbwb2n9r-perl-5.28.2
  /nix/store/89gi8cbp8l5sf0m8pgynp2mh1c6pk1gk-attr-2.4.48
  /nix/store/gkrpl3k6s43fkg71n0269yq3p1f0al88-perl5.28.2-ack-2.28-man
  /nix/store/iykxb0bmfjmi7s53kfg6pjbfpd8jmza6-glibc-2.27
  /nix/store/k8lhqzpaaymshchz8ky3z4653h4kln9d-coreutils-8.31
  /nix/store/svgkibi7105pm151prywndsgvmc4qvzs-acl-2.2.53
  /nix/store/x4knf14z1p0ci72gl314i7vza93iy7yc-perl5.28.2-File-Next-1.16
  /nix/store/zfj7ria2kwqzqj9dh91kj9kwsynxdfk0-perl5.28.2-ack-2.28
copying path '/nix/store/gkrpl3k6s43fkg71n0269yq3p1f0al88-perl5.28.2-ack-2.28-man' from 'https://cache.nixos.org'...
copying path '/nix/store/iykxb0bmfjmi7s53kfg6pjbfpd8jmza6-glibc-2.27' from 'https://cache.nixos.org'...
copying path '/nix/store/x4knf14z1p0ci72gl314i7vza93iy7yc-perl5.28.2-File-Next-1.16' from 'https://cache.nixos.org'...
copying path '/nix/store/89gi8cbp8l5sf0m8pgynp2mh1c6pk1gk-attr-2.4.48' from 'https://cache.nixos.org'...
copying path '/nix/store/svgkibi7105pm151prywndsgvmc4qvzs-acl-2.2.53' from 'https://cache.nixos.org'...
copying path '/nix/store/k8lhqzpaaymshchz8ky3z4653h4kln9d-coreutils-8.31' from 'https://cache.nixos.org'...
copying path '/nix/store/57iv2vch31v8plcjrk97lcw1zbwb2n9r-perl-5.28.2' from 'https://cache.nixos.org'...
copying path '/nix/store/zfj7ria2kwqzqj9dh91kj9kwsynxdfk0-perl5.28.2-ack-2.28' from 'https://cache.nixos.org'...
building '/nix/store/q3243sjg91x1m8ipl0sj5gjzpnbgxrqw-user-environment.drv'...
created 56 symlinks in user environment
real	0m 14.02s
user	0m 8.83s
sys	0m 2.69s

Debian’s apt takes almost 10 seconds to fetch and unpack 16 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y ack-grep)
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [233 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8270 kB]
Fetched 8502 kB in 2s (4764 kB/s)
[…]
The following NEW packages will be installed:
  ack ack-grep libfile-next-perl libgdbm-compat4 libgdbm5 libperl5.26 netbase perl perl-modules-5.26
The following packages will be upgraded:
  perl-base
1 upgraded, 9 newly installed, 0 to remove and 60 not upgraded.
Need to get 8238 kB of archives.
After this operation, 42.3 MB of additional disk space will be used.
[…]
real	0m9.096s
user	0m2.616s
sys	0m0.441s

Arch Linux’s pacman takes a little over 3s to fetch and unpack 6.5 MB.

% docker run -t -i archlinux/base
[root@9604e4ae2367 /]# time (pacman -Sy && pacman -S --noconfirm ack)
:: Synchronizing package databases...
 core            132.2 KiB  1033K/s 00:00
 extra          1629.6 KiB  2.95M/s 00:01
 community         4.9 MiB  5.75M/s 00:01
[…]
Total Download Size:   0.07 MiB
Total Installed Size:  0.19 MiB
[…]
real	0m3.354s
user	0m0.224s
sys	0m0.049s

Alpine’s apk takes only about 1 second to fetch and unpack 10 MB.

% docker run -t -i alpine
/ # time apk add ack
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
(1/4) Installing perl-file-next (1.16-r0)
(2/4) Installing libbz2 (1.0.6-r7)
(3/4) Installing perl (5.28.2-r1)
(4/4) Installing ack (3.0.0-r0)
Executing busybox-1.30.1-r2.trigger
OK: 44 MiB in 18 packages
real	0m 0.96s
user	0m 0.25s
sys	0m 0.07s

qemu

Fedora’s dnf takes over a minute to fetch and unpack 266 MB.

% docker run -t -i fedora /bin/bash
[root@722e6df10258 /]# time dnf install -y qemu
Fedora Modular 30 - x86_64            3.1 MB/s | 2.7 MB     00:00
Fedora Modular 30 - x86_64 - Updates  2.7 MB/s | 2.4 MB     00:00
Fedora 30 - x86_64 - Updates           20 MB/s |  19 MB     00:00
Fedora 30 - x86_64                     31 MB/s |  70 MB     00:02
[…]
Install  262 Packages
Upgrade    4 Packages

Total download size: 172 M
[…]
real	1m7.877s
user	0m44.237s
sys	0m3.258s

NixOS’s Nix takes 38s to fetch and unpack 262 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -i qemu-4.0.0'
unpacking channels...
created 2 symlinks in user environment
installing 'qemu-4.0.0'
these paths will be fetched (262.18 MiB download, 1364.54 MiB unpacked):
[…]
real	0m 38.49s
user	0m 26.52s
sys	0m 4.43s

Debian’s apt takes 51 seconds to fetch and unpack 159 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y qemu-system-x86)
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [149 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8426 kB]
Fetched 8574 kB in 1s (6716 kB/s)
[…]
Fetched 151 MB in 2s (64.6 MB/s)
[…]
real	0m51.583s
user	0m15.671s
sys	0m3.732s

Arch Linux’s pacman takes 1m2s to fetch and unpack 124 MB.

% docker run -t -i archlinux/base
[root@9604e4ae2367 /]# time (pacman -Sy && pacman -S --noconfirm qemu)
:: Synchronizing package databases...
 core       132.2 KiB   751K/s 00:00
 extra     1629.6 KiB  3.04M/s 00:01
 community    4.9 MiB  6.16M/s 00:01
[…]
Total Download Size:   123.20 MiB
Total Installed Size:  587.84 MiB
[…]
real	1m2.475s
user	0m9.272s
sys	0m2.458s

Alpine’s apk takes only about 2.4 seconds to fetch and unpack 26 MB.

% docker run -t -i alpine
/ # time apk add qemu-system-x86_64
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
[…]
OK: 78 MiB in 95 packages
real	0m 2.43s
user	0m 0.46s
sys	0m 0.09s

21 November, 2020 09:04AM

Winding down my Debian involvement

This post is hard to write, both in the emotional sense and in the “I would have written a shorter letter, but I didn’t have the time” sense. Hence, please assume the best of intentions when reading it—it is not my intention to make anyone feel bad about their contributions, but rather to provide some insight into why my frustration level ultimately exceeded the threshold.

Debian has been in my life for well over 10 years at this point.

A few weeks ago, I visited some old friends at the Zürich Debian meetup after a multi-year absence. On my bike ride home, it occurred to me that the topics of our discussions had remarkable overlap with my last visit. We had a discussion about the merits of systemd, which took a detour to respect in open source communities, returned to processes in Debian and eventually culminated in democracies and their theoretical/practical failings. Admittedly, that last one might be a Swiss thing.

I say this not to knock on the Debian meetup, but because it prompted me to reflect on the feelings Debian has been evoking in me lately and whether it’s still a good fit for me.

So I’m finally making a decision that I should have made a long time ago: I am winding down my involvement in Debian to a minimum.

What does this mean?

Over the coming weeks, I will:

  • transition packages to be team-maintained where it makes sense
  • remove myself from the Uploaders field on packages with other maintainers
  • orphan packages where I am the sole maintainer

I will try to keep up best-effort maintenance of the manpages.debian.org service and the codesearch.debian.net service, but any help would be much appreciated.

For all intents and purposes, please treat me as permanently on vacation. I will try to be around for administrative issues (e.g. permission transfers) and questions addressed directly to me, provided they are easy enough to answer.

Why?

When I joined Debian, I was still studying, i.e. I had luxurious amounts of spare time. Now, over 5 years of full-time work later, my day job has taught me a lot, both about what works in large software engineering projects and about how I personally like my computer systems. I am very conscious of how I spend the little spare time that I have these days.

The following sections each deal with what I consider a major pain point, in no particular order. Some of them influence each other—for example, if changes worked better, we could have a chance at transitioning packages to be more easily machine readable.

Change process in Debian

Over the last few years, my current team at work has conducted various smaller and larger refactorings across the entire code base (touching thousands of projects), so we have learnt a lot of valuable lessons about how to do these changes effectively. It irks me that Debian works almost the opposite way in every regard. I appreciate that every organization is different, but I think a lot of my points do actually apply to Debian.

In Debian, packages are nudged in the right direction by a document called the Debian Policy, or its programmatic embodiment, lintian.

While it is great to have a lint tool (for quick, local/offline feedback), it is even better to not require a lint tool at all. The team conducting the change (e.g. the C++ team introducing a new hardening flag for all packages) should be able to do their work transparently to me.

Instead, currently, all packages become lint-unclean, all maintainers need to read up on what the new thing is, how it might break, whether/how it affects them, manually run some tests, and finally decide to opt in. This causes a lot of overhead and manually executed mechanical changes across packages.

Notably, the cost of each change is distributed onto the package maintainers in the Debian model. At work, we have found that the opposite works better: if the team behind the change is put in power to do the change for as many users as possible, they can be significantly more efficient at it, which reduces the total cost and time a lot. Of course, exceptions (e.g. a large project abusing a language feature) should still be taken care of by the respective owners, but the important bit is that the default should be the other way around.

Debian is lacking tooling for large changes: it is hard to programmatically deal with packages and repositories (see the section below). The closest to “sending out a change for review” is to open a bug report with an attached patch. I thought the workflow for accepting a change from a bug report was too complicated and started mergebot, but only Guido ever signaled interest in the project.

Culturally, reviews and reactions are slow. There are no deadlines. I literally sometimes get emails notifying me that a patch I sent out a few years ago (!!) is now merged. This turns projects that should take a small number of weeks into ones that take many years, which is a huge demotivator for me.

Interestingly enough, you can see artifacts of the slow online activity manifest itself in the offline culture as well: I don’t want to be discussing systemd’s merits 10 years after I first heard about it.

Lastly, changes can easily be slowed down significantly by holdouts who refuse to collaborate. My canonical example for this is rsync, whose maintainer refused my patches to make the package use debhelper purely out of personal preference.

Granting so much personal freedom to individual maintainers prevents us as a project from raising the abstraction level for building Debian packages, which in turn makes tooling harder.

How would things look in a better world?

  1. As a project, we should strive towards more unification. Uniformity still does not rule out experimentation; it just changes the trade-off from easier experimentation and harder automation to harder experimentation and easier automation.
  2. Our culture needs to shift from “this package is my domain, how dare you touch it” to a shared sense of ownership, where anyone in the project can easily contribute (reviewed) changes without necessarily even involving individual maintainers.

To learn more about what successful large changes can look like, I recommend my colleague Hyrum Wright’s talk “Large-Scale Changes at Google: Lessons Learned From 5 Yrs of Mass Migrations”.

Fragmented workflow and infrastructure

Debian generally seems to prefer decentralized approaches over centralized ones. For example, individual packages are maintained in separate repositories (as opposed to in one repository), each repository can use any SCM (git and svn are common ones) or no SCM at all, and each repository can be hosted on a different site. Of course, what you do in such a repository also varies subtly from team to team, and even within teams.

In practice, non-standard hosting options are used rarely enough to not justify their cost, but frequently enough to be a huge pain when trying to automate changes to packages. Instead of using GitLab’s API to create a merge request, you have to design an entirely different, more complex system, which deals with intermittently (or permanently!) unreachable repositories and abstracts away differences in patch delivery (bug reports, merge requests, pull requests, email, …).
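
By way of contrast, if every package lived on one uniform GitLab instance, a mass change could boil down to a single API call per repository. A minimal sketch, assuming salsa.debian.org and placeholder values for the token, project path and branch names:

# hypothetical mass-change step: open a merge request for one package on salsa
curl --request POST --header "PRIVATE-TOKEN: $SALSA_TOKEN" \
  --data "source_branch=mass-change&target_branch=master&title=Apply new hardening flag" \
  "https://salsa.debian.org/api/v4/projects/debian%2Fexample-package/merge_requests"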

Wildly diverging workflows are not just a temporary problem either. I participated in long discussions about different git workflows during DebConf 13, and I gather that there were similar discussions in the meantime.

Personally, I cannot keep enough details of the different workflows in my head. Every time I touch a package that works differently than mine, it frustrates me immensely to re-learn aspects of my day-to-day.

After noticing workflow fragmentation in the Go packaging team (which I started), I tried fixing this with the workflow changes proposal, but did not succeed in implementing it. The lack of effective automation and slow pace of changes in the surrounding tooling despite my willingness to contribute time and energy killed any motivation I had.

Old infrastructure: package uploads

When you want to make a package available in Debian, you upload GPG-signed files via anonymous FTP. There are several batch jobs (the queue daemon, unchecked, dinstall, possibly others) which run on fixed schedules (e.g. dinstall runs at 01:52 UTC, 07:52 UTC, 13:52 UTC and 19:52 UTC).

Depending on timing, I estimated that you might wait for over 7 hours (!!) before your package is actually installable (consecutive dinstall runs are six hours apart, and mirror propagation adds more time on top).

What’s worse for me is that feedback to your upload is asynchronous. I like to do one thing, be done with it, move to the next thing. The current setup requires a many-minute wait and costly task switch for no good technical reason. You might think a few minutes aren’t a big deal, but when all the time I can spend on Debian per day is measured in minutes, this makes a huge difference in perceived productivity and fun.

The last communication I can find about speeding up this process is ganneff’s post from 2008.

How would things look in a better world?

  1. Anonymous FTP would be replaced by a web service which ingests my package and returns an authoritative accept or reject decision in its response (a possible shape of such a service is sketched after this list).
  2. For accepted packages, there would be a status page displaying the build status and when the package will be available via the mirror network.
  3. Packages should be available within a few minutes after the build completed.
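
Purely as an illustration of point 1 above, and emphatically not an existing service, a synchronous upload could be a single HTTP request whose response is the verdict; the URL and form fields below are invented:

# hypothetical API, nothing like this exists today
curl -f -F 'changes=@hello_1.0-1_amd64.changes' -F 'deb=@hello_1.0-1_amd64.deb' \
  https://package-upload.example.org/v1/uploads \
  && echo accepted || echo rejected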

Old infrastructure: bug tracker

I dread interacting with the Debian bug tracker. debbugs is a piece of software (from 1994) which is only used by Debian and the GNU project these days.

Debbugs processes emails, which is to say it is asynchronous and cumbersome to deal with. Despite running on the fastest machines we have available in Debian (or so I was told when the subject last came up), its web interface loads very slowly.

Notably, the web interface at bugs.debian.org is read-only. Setting up a working email setup for reportbug(1) or manually dealing with attachments is a rather big hurdle.

For reasons I don’t understand, every interaction with debbugs results in many different email threads.

Aside from the technical implementation, I also can never remember the different ways that Debian uses pseudo-packages for bugs and processes. I need them rarely enough that I never establish a mental model of how they are set up, or working memory of how they are used, but frequently enough to be annoyed by this.

How would things look in a better world?

  1. Debian would switch from a custom bug tracker to a (any) well-established one.
  2. Debian would offer automation around processes. It is great to have a paper-trail and artifacts of the process in the form of a bug report, but the primary interface should be more convenient (e.g. a web form).

Old infrastructure: mailing list archives

It baffles me that in 2019, we still don’t have a conveniently browsable threaded archive of mailing list discussions. Email and threading are more widely used in Debian than anywhere else, so this is somewhat ironic. Gmane used to paper over this issue, but Gmane’s availability over the last few years has been spotty, to say the least (it is down as I write this).

I tried to contribute a threaded list archive, but our listmasters didn’t seem to care or want to support the project.

Debian is hard to machine-read

While it is obviously possible to deal with Debian packages programmatically, the experience is far from pleasant. Everything seems slow and cumbersome. I have picked just 3 quick examples to illustrate my point.

debiman needs help from piuparts in analyzing the alternatives mechanism of each package to display the manpages of e.g. psql(1). This is because maintainer scripts modify the alternatives database by calling shell scripts. Without actually installing a package, you cannot know which changes it does to the alternatives database.
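
To illustrate why this cannot be derived statically: the registration typically happens imperatively when a maintainer script runs at install time, not declaratively in the package metadata. A hypothetical postinst fragment (package name and paths invented) might contain:

# hypothetical postinst snippet: the alternatives entry only comes into existence
# when this runs, so it is invisible to anyone who merely inspects the .deb
update-alternatives --install /usr/bin/editor editor /usr/bin/myeditor 40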

pk4 needs to maintain its own cache to look up package metadata based on the package name. Other tools parse the apt database from scratch on every invocation. A proper database format, or at least a binary interchange format, would go a long way.
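
For a sense of the status quo, a one-off metadata lookup today often amounts to re-scanning the full package lists on every call, for example with dctrl-tools (assuming the lists are stored uncompressed under /var/lib/apt/lists):

# re-reads and re-parses every Packages file just to answer one query
time grep-dctrl -P -X ack /var/lib/apt/lists/*_Packages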

Debian Code Search wants to ingest new packages as quickly as possible. There used to be a fedmsg instance for Debian, but it no longer seems to exist. It is unclear where to get notifications from for new packages, and where best to fetch those packages.

Complicated build stack

See my “Debian package build tools” post. It really bugs me that the sprawl of tools is not seen as a problem by others.

Developer experience pretty painful

Most of the points discussed so far deal with the experience in developing Debian, but as I recently described in my post “Debugging experience in Debian”, the experience when developing using Debian leaves a lot to be desired, too.

I have more ideas

At this point, the article is getting pretty long, and hopefully you got a rough idea of my motivation.

While I described a number of specific shortcomings above, the final nail in the coffin is actually the lack of a positive outlook. I have more ideas that seem really compelling to me, but, based on how my previous projects have been going, I don’t think I can make any of these ideas happen within the Debian project.

I intend to publish a few more posts about specific ideas for improving operating systems here. Stay tuned.

Lastly, I hope this post inspires someone, ideally a group of people, to improve the developer experience within Debian.

21 November, 2020 09:04AM

November 20, 2020

hackergotchi for Shirish Agarwal

Shirish Agarwal

Rights, Press freedom and India

In some ways it is sad and interesting to see how personal liberty is viewed in India, and how those with the highest fame and power can get a kind of justice that the rest cannot.

Arnab Goswami

This particular gentleman is a class apart. He is the editor as well as part-owner of Republic TV, a right-leaning channel which demonizes minorities, women and whatever else is antithetical to the Central Govt. of India. As a result there has been a spate of cases against him in the past few months. But surprisingly, in each of them he got a hearing the day after the suit was filed. This is unique in Indian legal history, so much so that a popular legal site which publishes on-going cases put up a post sharing how he was getting prompt hearings. That post itself needs to be updated as there have been 3 more hearings which have been done back to back for him. This is unusual as there are so many cases pending for the SC’s attention, some arguably more important than this gentleman’s. So many precedents have been set which will send a wrong message. The biggest one: even though a trial is taking place in the sessions court (below the High Court), the SC can interject on matters. What this will do to the morale of both lawyers and judges of the various Sessions Courts is a matter of speculation, and yet, as shared, it is unprecedented. The saddest part was when Justice Chandrachud said –

Justice Chandrachud – If you don’t like a channel then don’t watch it. – 11th November 2020 .

This is basically giving free rein to hate speech. How can the SC say something like that? And this is the same Supreme Court which could not take two tweets from Shri Prashant Bhushan when he made remarks against the judiciary.

J&K pleas in Supreme Court pending since August 2019 (Abrogation 370)

After the abrogation of Article 370, the citizens of Jammu and Kashmir, a population of 13.6 million people including 4 million Hindus, have been stuck with reduced rights and their land being taken away under new laws. Many of the Hindus, who are regionally a minority, now rue the fact that they supported the abrogation of 370A. Imagine a whole state whose questions and prayers have not been heard by the Supreme Court, so that the people need to move a prayer stating the same.

100 Journalists, activists languishing in Jail without even a hearing

55 journalists alone have been threatened, booked and jailed for reporting on the pandemic. Their fault: they were bringing to light the irregularities and corruption of the early months of the pandemic. Activists such as Sudha Bharadwaj, who gave up her American citizenship and settled in India to fight for tribals, has been in jail for 2 years without any charges. There are many like her. There are several more petitions lying in the Supreme Court, for example Varavara Rao’s, with not a single hearing in the last couple of years, even though he has taken part in so many national movements, including during the Emergency, and was partly responsible for the creation of Telangana state out of Andhra Pradesh.

Then there is Devangana Kalita, who works for gender rights. Similar to Sudha Bharadwaj, she had an opportunity to go to the UK and settle there. She did her master’s and came back. And now she is in jail for the things that she studied. While she took part in anti-CAA sit-ins, none of her speeches were incendiary, but she is still locked up under the UAPA (Unlawful Activities (Prevention) Act). I could go on and on, but at the moment these should suffice.

Petitions on the hate speech which resulted in the riots in Delhi are pending, and there have been no hearings till date on the (controversial) Citizenship Amendment Act. All of this has been explained best in a newspaper article which articulates perhaps all that I wanted to articulate and more. It is and was amazing to see how in certain cases Article 32 is valid and in many it is not. Also, a fair reading of Justice Bobde’s article tells you a lot about how the SC is functioning. I would like to point out that barandbench along with livelawindia make it easier for non-lawyers and the public to know how arguments are made in court and what evidence is taken, as well as give some clue about judicial orders and judgements. Both of these resources provide an invaluable service, more often than not free of charge.

Student Suicide and High Cost of Education

For quite some time now, the cost of education has been shooting up. While I have visited this topic earlier as well, recently a young girl committed suicide because she was unable to pay the fees as well as the additional costs due to the pandemic. Further investigations show that this is the case with many students who are unable to buy laptops. While one could think it is limited to one college, that would be wrong; it is happening almost across all of India, and it will continue for months and years. People do know that the pandemic is going to last a significant time, and it will be a long time before the R value becomes zero. Even the promising vaccine from Pfizer needs constant refrigeration, which is next to impossible in India. It is going to make things very costly.

Last Nail on Indian Media

Just today the last nail in the coffin of Indian media has been driven in. Thankfully Freedom Gazette India did a much better job of explaining it, so I am just pasting that –

Information and Broadcasting Ministry bringing OTT services as well as news within its ambit.

With this, projects like Scam 1992: The Harshad Mehta Story, Bad Boy Billionaires: India, Test Case, Delhi Crime, Laakhon Mein Ek and other such series and investigative journalism would be still-born. Many of these web series also shared tales of women’s empowerment, while at the same time showing some of the hard choices that women have to contend with.

Even western media may be censored where the government finds the political discourse not to its liking. There have been so many accounts from Mr. Ravish Kumar, winner of the Ramon Magsaysay Award, of how the electricity was cut in many places during his shows. I too have been a victim of this when the BJP governed Maharashtra, as almost all Puneites experienced: the power would go out for half an hour or 45 minutes at exactly that time.

There is another aspect to it. The U.S. elections showed how independent media was able to counter Mr. Trump’s various falsehoods and give rise to alternative ideas, which helped the team of Bernie Sanders, Joe Biden and Kamala Harris; Biden is now the President-elect and Kamala Harris the Vice-President-elect. Although the journey to the White House still seems as tough as before. Let’s see what happens.

Hopefully 2021 will bring in some good news.



20 November, 2020 07:39PM by shirishag75

November 19, 2020

Molly de Blanc

Transparency

Technology must be transparent in order to be knowable. Technology must be knowable in order for us to be able to consent to it in good faith. Good faith informed consent is necessary to preserving our (digital) autonomy.

Let’s now look at this in reverse, considering first why informed consent is necessary to our digital autonomy.

Let’s take the concept of our digital autonomy as being one of the highest goods. It is necessary to preserve and respect the value of each individual, and the collectives we choose to form. It is a right to which we are entitled by our very nature, and a prerequisite for building the lives we want, that fulfill us. This is something that we have generally agreed on as important or even sacred. Our autonomy, in whatever form it takes, in whatever part of our life it governs, is necessary and must be protected.

One of the things we must do in order to accomplish this is to build a practice and culture of consent. Giving consent — saying yes — is not enough. This consent must come from a place of understanding of that to which one is consenting. “Informed consent is consenting to the unknowable.”(1)

Looking at sexual consent as a parallel, even when we have a partner who discloses their sexual history and activities, we cannot know whether they are being truthful and complete. Even if we say they are and that we can trust this, there is a limit to how much even they know about their body, health, and experience. They might not know the extent of their other partners’ experience. They might be carrying HPV without symptoms; we rarely test for herpes.

Arguably, we have more potential to definitely know what is occurring when it comes to technological consent. Technology can be broken apart. We can share and examine code, schematics, and design documentation. Certainly, lots of information is being hidden from us — a lot of code is proprietary, technical documentation unavailable, and the skills to process these things are treated as special, arcane, and even magical. Tracing the resource pipelines for the minerals and metals essential to building circuit boards is not possible for the average person. Knowing the labor practices of each step of this process, and understanding what those imply for individuals, societies, and the environments they exist in seems improbable at best.

Even though true informed consent might not be possible, it is an ideal towards which we must strive. We must work with what we have, and we must be provided as much as possible.

A periodic conversation that arises in the consideration of technology rights is whether companies should build backdoors into technology for the purpose of government exploitation. A backdoor is a hidden vulnerability in a piece of technology that, when used, would afford someone else access to your device or work or cloud storage or whatever. As long as the source code that powers computing technology is proprietary and opaque, we cannot truly know whether backdoors exist and how secure we are in our digital spaces and even our own computers, phones, and other mobile devices.

We must commit wholly to transparency and openness in order to create the possibility of as-informed-as-possible consent in order to protect our digital autonomy. We cannot exist in a vacuum and practical autonomy relies on networks of truth in order to provide the opportunity for the ideal of informed consent. These networks of truth are created through the open availability and sharing of information, relating to how and why technology works the way it does.

(1) Heintzman, Kit. 2020.

19 November, 2020 03:24PM by mollydb

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

COVID-19 vaccine confidence intervals

I keep hearing about new vaccines being “at least 90% effective”, “94.5% effective”, “92% effective” etc... and that's obviously really good news. But is that a point estimate, or a confidence interval? Does 92% mean “anything from 70% to 99%”, given that n=20?

I dusted off the memories of how bootstrapping works (I didn't want to try to figure out whether one could really approximate using the Cauchy distribution or not) and wrote some R code. Obviously, don't use this for medical or policy decisions, since I have a background in neither medicine nor medical statistics. But the results are uplifting nevertheless; here from the Pfizer/BioNTech data that I could find:

> N <- 43538 / 2
> infected_vaccine <- c(rep(1, times = 8), rep(0, times=N-8))
> infected_placebo <- c(rep(1, times = 162), rep(0, times=N-162))
>
> infected <- c(infected_vaccine, infected_placebo)
> vaccine <- c(rep(1, times=N), rep(0, times=N))
> mydata <- data.frame(infected, vaccine)
>
> library(boot)
> rsq <- function(data, indices) {
+   d <- data[indices,]
+   num_infected_vaccine <- sum(d[which(d$vaccine == 1), ]$infected)
+   num_infected_placebo <- sum(d[which(d$vaccine == 0), ]$infected)
+   return(1.0 - num_infected_vaccine / num_infected_placebo)
+ }
>
> results <- boot(data=mydata, statistic=rsq, R=1000)
> results

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = mydata, statistic = rsq, R = 1000)


Bootstrap Statistics :
     original       bias    std. error
t1* 0.9506173 -0.001428342  0.01832874
> boot.ci(results, type="perc")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = results, type = "perc")

Intervals :
Level     Percentile
95%   ( 0.9063,  0.9815 )
Calculations and Intervals on Original Scale

So that would be a 95% CI of between 90.6% and 98.1% effective, roughly. The confidence intervals might be slightly too wide, since I didn't have enough RAM (!) to run the bias-corrected and accelerated (BCa) bootstrap intervals.

Again, take it with a grain of salt. Corrections welcome. :-)

19 November, 2020 09:39AM

hackergotchi for Daniel Silverstone

Daniel Silverstone

Withdrawing Gitano from support

Unfortunately, in Debian in particular, libgit2 is undergoing a transition which is blocked by gall. Despite having had over a month to deal with this, I've not managed to summon the tuits to update Gall to the new libgit2 which means, nominally, I ought to withdraw it from testing and possibly even from unstable given that I'm not really prepared to look after Gitano and friends in Debian any longer.

However, I'd love for Gitano to remain in Debian if it's useful to people. Gall isn't exactly a large piece of C code, and so it probably won't be a huge job to do the port; I simply don't have the desire/energy to do it myself.

If someone wanted to do the work and provide a patch / "pull request" to me, then I'd happily take on the change and upload a new package, or if someone wanted to NMU the gall package in Debian I'll take the change they make and import it into upstream. I just don't have the energy to reload all that context and do the change myself.

If you want to do this, email me and let me know, so I can support you and take on the change when it's done. Otherwise I probably will go down the lines of requesting Gitano's removal from Debian in the next week or so.

19 November, 2020 08:49AM by Daniel Silverstone

November 17, 2020

hackergotchi for Daniel Pocock

Daniel Pocock

Online Meetings: The Temptation to Censor Tricky Questions

Early in 2020, at the outset of the pandemic, the UN's special rapporteur on torture and other cruel, inhumane or degrading treatment or punishment, Professor Nils Melzer of Switzerland, spoke out about the growing problem of cybertorture.

I could immediately relate to this. I had been a volunteer mentor in the Google Summer of Code since 2013. I withdrew from the program at an acutely painful time for my family, losing two family members in less than a year. Within a week, Stephanie Taylor, the head of the program at Google, was sending me threats and insults. Taylor's Open Source Program Office at Google relies on hundreds of volunteers like me to do work for them.

Everybody else in my life, my employer, friends and other non-profit organizations that I contribute to responded with compassion and sympathy. Taylor and her associates chose threats and insults. Taylor had chosen to take the pain from my personal life and drag it into my professional life. Despite exercising my rights under the GDPR and asking her to stop this experiment and get out of my life, she continues to sustain it.

The UN's Forum on Business and Human Rights is taking place this week. It is online due to the pandemic. In the session about accountability and remedies for victims of human rights abuse, my experience with Google and people like Taylor was at the front of my mind. I'm not the only one thinking about Google as a bunch of gangsters: a British parliamentary report and a US Department of Justice investigation have also used terms like digital gangster and unlawful to describe the way that people like this are operating.

Yet when I entered the UN's online event and asked a very general question about the connection from Professor Melzer's analysis to Google's modus operandi, the question vanished. I posted a subsequent question asking why my query was censored and it was immediately subject to censorship. This is the golden rule of censorship: don't ask about censorship. I never received any correspondence or complaints about the question.

Article 19 of the Universal Declaration of Human Rights proclaims the right to free speech. Within the first year of the pandemic, the UN has already set that aside, not wanting to offend the Googlers who have planted lobbyists everywhere in the corridors of power. If somebody asked the same question in a real world event at the UN in Geneva or New York, would a trap door open up underneath them and make them disappear? Or would members of the panel and the audience need to contemplate Professor Melzer's work on cybertorture seriously?

Some people suggested the spam-filter may have been triggered by my name. This is simply offensive and once again, there is no parallel to this at a real world event.

If you wish to participate in the final day of the forum, you can use the following links:

This incident emphasizes the extent to which online events are being scripted and choreographed to look like spontaneous discussions while in reality, they are maintaining the status quo.

This demonstrates a huge difference between real world events and online events. In a real world event, when somebody stands up to ask a question, the chair of the meeting or the panel has no forewarning about the question. Online events change that dramatically. Observers may not know which questions were really avoided.

When technology gives leaders the opportunity to simply avoid difficult questions, they are tempted to do so, whether it is in the annual meeting of a local bridge club or a UN forum.

17 November, 2020 10:50PM

hackergotchi for Rapha&#235;l Hertzog

Raphaël Hertzog

Freexian’s report about Debian Long Term Support, October 2020

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In October, 221.50 work hours have been dispatched among 13 paid contributors. Their reports are available:
  • Abhijith PA did 16.0h (out of 14h assigned and 2h from September).
  • Adrian Bunk did 7h (out of 20.75h assigned and 5.75h from September), thus carrying over 19.5h to November.
  • Ben Hutchings did 11.5h (out of 6.25h assigned and 9.75h from September), thus carrying over 4.5h to November.
  • Brian May did 10h (out of 10h assigned).
  • Chris Lamb did 18h (out of 18h assigned).
  • Emilio Pozuelo Monfort did 20.75h (out of 20.75h assigned).
  • Holger Levsen did 7.0h coordinating/managing the LTS team.
  • Markus Koschany did 20.75h (out of 20.75h assigned).
  • Mike Gabriel gave back the 8h he was assigned. See below 🙂
  • Ola Lundqvist did 10.5h (out of 8h assigned and 2.5h from September).
  • Roberto C. Sánchez did 13.5h (out of 20.75h assigned) and gave back 7.25h to the pool.
  • Sylvain Beucler did 20.75h (out of 20.75h assigned).
  • Thorsten Alteholz did 20.75h (out of 20.75h assigned).
  • Utkarsh Gupta did 20.75h (out of 20.75h assigned).

Evolution of the situation

October was a regular LTS month, with an LTS team meeting held via video chat, and thus there is no log to be shared. After more than five years of contributing to LTS (and ELTS), Mike Gabriel announced that he founded a new company called Frei(e) Software GmbH and thus would leave us to concentrate on this new endeavor. Best of luck with that, Mike! So, once again, this is a good moment to remind you that we are constantly looking for new contributors. Please contact Holger if you are interested!

The security tracker currently lists 42 packages with a known CVE and the dla-needed.txt file has 39 packages needing an update.

Thanks to our sponsors

Sponsors that joined recently are in bold.

17 November, 2020 09:06AM by Raphaël Hertzog

hackergotchi for Jaldhar Vyas

Jaldhar Vyas

Sal Mubarak 2077!

[Celebrating Diwali wearing a mask]

Best wishes to the entire Debian world for a happy, prosperous and safe Gujarati new year, Vikram Samvat 2077 named Paridhawi.

17 November, 2020 08:05AM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppArmadillo 0.10.1.2.0

Armadillo is a powerful and expressive C++ template library for linear algebra, aiming towards a good balance between speed and ease of use, with a syntax deliberately close to Matlab. RcppArmadillo integrates this library with the R environment and language, and is widely used by (currently) 779 other packages on CRAN.

This release ties up a few loose ends from the recent 0.10.1.0.0.

Changes in RcppArmadillo version 0.10.1.2.0 (2020-11-15)

  • Upgraded to Armadillo release 10.1.2 (Orchid Ambush)

  • Remove three unused int constants (#313)

  • Include main armadillo header using quotes instead of brackets

  • Rewrite version number use in old-school mode because gcc 4.8.5

  • Skipping parts of sparse conversion on Windows as win-builder fails

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 November, 2020 02:03AM

RcppAnnoy 0.0.17

A new release 0.0.17 of RcppAnnoy is now on CRAN. RcppAnnoy is the Rcpp-based R integration of the nifty Annoy library by Erik Bernhardsson. Annoy is a small and lightweight C++ template header library for very fast approximate nearest neighbours—originally developed to drive the famous Spotify music discovery algorithm.

This release brings a new upstream version 1.17, released a few weeks ago, which adds multithreaded index building. This changes the API by adding a new ‘threading policy’ parameter requiring code using the main Annoy header to update. For this reason we waited a little for the dust to settle on the BioConductor 3.12 release before bringing the changes to BiocNeighbors via this commit and to uwot via this simple PR. Aaron and James updated their packages accordingly so by the time I uploaded RcppAnnoy it made for very smooth sailing as we all had done our homework with proper conditional builds, and the package had no other issue preventing automated processing at CRAN. Yay. I also added a (somewhat overdue one may argue) header file RcppAnnoy.h regrouping defines and includes which should help going forward.

Detailed changes follow below.

Changes in version 0.0.17 (2020-11-15)

  • Upgrade to Annoy 1.17, but default to serial use.

  • Add new header file to regroup includes and defines.

  • Upgrade CI script to use R with bspm on focal.

Courtesy of my CRANberries, there is also a diffstat report for this release.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 November, 2020 01:48AM

hackergotchi for David Bremner

David Bremner

Reading MPS files with glpk

Recently I was asked how to read MPS (old-school linear programming input) files. I couldn't think of a completely off-the-shelf way to do it, so I wrote a simple C program to use the glpk library.

Of course in general you would want to do something other than print it out again.

17 November, 2020 12:29AM

Tangling multiple files

I have lately been using org-mode literate programming to generate example code and beamer slides from the same source. I hit a wall trying to re-use functions in multiple files, so I came up with the following hack. Thanks 'ngz' on #emacs and Charles Berry on the org-mode list for suggestions and discussion.

;; Collect the values of all "#+TANGLE_INCLUDE:" keywords in the current buffer.
(defun db-extract-tangle-includes ()
  (goto-char (point-min))
  (let ((case-fold-search t)
        (retval nil))
    (while (re-search-forward "^#[+]TANGLE_INCLUDE:" nil t)
      (let ((element (org-element-at-point)))
        (when (eq (org-element-type element) 'keyword)
          (push (org-element-property :value element) retval))))
    retval))

;; Before tangling, ingest the named source blocks of each included file into
;; the Library of Babel so they can be referenced from this file.
(defun db-ob-tangle-hook ()
  (let ((includes (db-extract-tangle-includes)))
    (mapc #'org-babel-lob-ingest includes)))

(add-hook 'org-babel-pre-tangle-hook #'db-ob-tangle-hook t)

Use involves something like the following in your org-file.

#+SETUPFILE: presentation-settings.org
#+SETUPFILE: tangle-settings.org
#+TANGLE_INCLUDE: lecture21.org
#+TITLE: GC V: Mark & Sweep with free list

For batch export with make, I do something like

%.tangle-stamp: %.org
    emacs --batch --quick  -l org  -l ${HOME}/.emacs.d/org-settings.el --eval "(org-babel-tangle-file \"$<\")"
    touch $@

17 November, 2020 12:29AM

Debcamp activities 2018

Emacs

2018-07-23

  • NMUed cdargs
  • NMUed silversearcher-ag-el
  • Uploaded the partially unbundled emacs-goodies-el to Debian unstable
  • packaged and uploaded graphviz-dot-mode

2018-07-24

  • packaged and uploaded boxquote-el
  • uploaded apache-mode-el
  • Closed bugs in graphviz-dot-mode that were fixed by the new version.
  • filed lintian bug about empty source package fields

2018-07-25

  • packaged and uploaded emacs-session
  • worked on sponsoring tabbar-el

2018-07-25

  • uploaded dh-make-elpa

Notmuch

2018-07-2[23]

Wrote a patch series to fix a bug noticed by seanw while (seanw was) working on a package inspired by policy workflow.

2018-07-25

  • Finished reviewing a patch series from dkg about protected headers.

2018-07-26

  • Helped seanw find the right config option for his bug report

  • Reviewed change proposal from aminb, suggested some issues to watch out for.

2018-07-27

  • Add test for threading issue.

Nullmailer

2018-07-25

  • uploaded nullmailer backport

2018-07-26

  • add "envelopefilter" feature to remotes in nullmailer-ssh

Perl

2018-07-23

2018-07-24

  • Forwarded #704527 to https://rt.cpan.org/Ticket/Display.html?id=125914

2018-07-25

  • Uploaded libemail-abstract-perl to fix Vcs-* urls
  • Updated debhelper compat and Standards-Version for libemail-thread-perl
  • Uploaded libemail-thread-perl

2018-07-27

  • fixed RC bug #904727 (blocking for perl transition)

Policy and procedures

2018-07-22

  • seconded #459427

2018-07-23

  • seconded #813471
  • seconded #628515

2018-07-25

  • read and discussed draft of salvaging policy with Tobi

2018-07-26

  • Discussed policy bug about short form License and License-Grant

2018-07-27

  • worked with Tobi on salvaging proposal

17 November, 2020 12:29AM

Yet another buildinfo database.

What?

I previously posted about my extremely quick-and-dirty buildinfo database using buildinfo-sqlite. This year at DebConf, I re-implemented this using a PostgreSQL backend and added some new features.

There are already buildinfo and buildinfos. I was informed I needed to think up a name that clearly distinguishes it from those two. Thus I give you builtin-pho.

There's a README for how to set up a local database. You'll need 12GB of disk space for the buildinfo files and another 4GB for the database (pro tip: you might want to change the location of your PostgreSQL data_directory, depending on how roomy your /var is).
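
For what it's worth, relocating the data directory on a Debian system can look roughly like the following; the target path is an example, and you should double-check data_directory in /etc/postgresql/*/main/postgresql.conf afterwards:

sudo systemctl stop postgresql
sudo rsync -a /var/lib/postgresql/ /srv/postgresql/
sudo sed -i 's|/var/lib/postgresql|/srv/postgresql|' /etc/postgresql/*/main/postgresql.conf
sudo systemctl start postgresql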

Demo 1: find things built against old / buggy Build-Depends

select distinct p.source,p.version,d.version, b.path
from
      binary_packages p, builds b, depends d
where
      p.suite='sid' and b.source=p.source and
      b.arch_all and p.arch = 'all'
      and p.version = b.version
      and d.id=b.id and d.depend='dh-elpa'
      and d.version < debversion '1.16'

Demo 2: find packages in sid without buildinfo files

select distinct p.source,p.version
from
      binary_packages p
where
      p.suite='sid'
except
        select p.source,p.version
from binary_packages p, builds b
where
      b.source=p.source
      and p.version=b.version
      and ( (b.arch_all and p.arch='all') or
            (b.arch_amd64 and p.arch='amd64') )

Disclaimer

Work in progress by an SQL newbie.

17 November, 2020 12:29AM

git-annex and ikiwiki, not as hard as I expected

Background

So apparently there's this pandemic thing, which means I'm teaching "Alternate Delivery" courses now. These are just like online courses, except possibly more synchronous, definitely less polished, and the tuition money doesn't go to the College of Extended Learning. I figure I'll need to manage and share videos, and our learning management system, in the immortal words of Marie Kondo, does not bring me joy. This has caused me to revisit the problem of sharing large files in an ikiwiki based site (like the one you are reading).

My go-to solution for large file management is git-annex. The last time I looked at this (a decade ago or so?), I was blocked by git-annex using symlinks and ikiwiki ignoring them for security-related reasons. Since then, two things have changed which make things relatively easy.

  1. I started using the rsync_command ikiwiki option to deploy my site.

  2. git-annex went through several design iterations for allowing non-symlink access to large files.

TL;DR

In my ikiwiki config

    # attempt to hardlink source files? (optimisation for large files)
    hardlink => 1,

In my ikiwiki git repo

$ git annex init
$ git annex add foo.jpg
$ git commit -m'add big photo'
$ git annex adjust --unlock                 # look ikiwiki, no symlinks
$ ikiwiki --setup ~/.config/ikiwiki/client  # rebuild my local copy, for review
$ ikiwiki --setup /home/bremner/.config/ikiwiki/rsync.setup --refresh  # deploy

You can see the result at photo

17 November, 2020 12:29AM

November 16, 2020

hackergotchi for Bits from Debian

Bits from Debian

New Debian Developers and Maintainers (September and October 2020)

The following contributors got their Debian Developer accounts in the last two months:

  • Benda XU (orv)
  • Joseph Nahmias (jello)
  • Marcos Fouces (marcos)
  • Hayashi Kentaro (kenhys)
  • James Valleroy (jvalleroy)
  • Helge Deller (deller)

The following contributors were added as Debian Maintainers in the last two months:

  • Ricardo Ribalda Delgado
  • Pierre Gruet
  • Henry-Nicolas Tourneur
  • Aloïs Micard
  • Jérôme Lebleu
  • Nis Martensen
  • Stephan Lachnit
  • Felix Salfelder
  • Aleksey Kravchenko
  • Étienne Mollier

Congratulations!

16 November, 2020 07:00PM by Jean-Pierre Giraud

Antoine Beaupré

CVE-2020-13777 GnuTLS audit: be scared

So CVE-2020-13777 came out while I wasn't looking last week. The GnuTLS advisory (GNUTLS-SA-2020-06-03) is pretty opaque so I'll refer instead to this tweet from @FiloSottile (Go team security lead):

PSA: don't rely on GnuTLS, please.

CVE-2020-13777 Whoops, for the past 10 releases most TLS 1.0–1.2 connection could be passively decrypted and most TLS 1.3 connections intercepted. Trivially.

Also, TLS 1.2–1.0 session tickets are awful.

You are reading this correctly: supposedly encrypted TLS connections made with affected GnuTLS releases are vulnerable to passive cleartext recovery attack (and active for 1.3, but who uses that anyways). That is extremely bad. It's pretty close to just switching everyone to HTTP instead of HTTPS, more or less. I would have a lot more to say about the security of GnuTLS in particular -- and security in general -- but I am mostly concerned about patching holes in the roof right now, so this article is not about that.

This article is about figuring out what, exactly, was exposed in our infrastructure because of this.

Affected packages

Assuming you're running Debian, this will show a list of packages that Depends on GnuTLS:

apt-cache --installed rdepends libgnutls30 | grep '^ ' | sort -u

This assumes you run this only on hosts running Buster or above. Otherwise you'll need to figure out a way to pick machines running GnuTLS 3.6.4 or later.
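
One rough way to do that check on each host, using dpkg's own version comparison (the threshold mirrors the 3.6.4 cut-off mentioned above):

v=$(dpkg-query -W -f'${Version}' libgnutls30 2>/dev/null)
if [ -n "$v" ] && dpkg --compare-versions "$v" ge 3.6.4; then
    echo "libgnutls30 $v: potentially affected"
fi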

Note that this list only includes first-level dependencies! It is perfectly possible that another package uses GnuTLS without being listed here. For example, in the above list I have libcurl3-gnutls, so to be really thorough, I would actually need to recurse down the dependency tree.
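
A rough sketch of such a recursive walk follows; it inherits the quirks of the one-level command (for example the alternative-dependency markers), so treat the output as an approximation:

queue="libgnutls30"; seen=""
while [ -n "$queue" ]; do
    next=""
    for pkg in $queue; do
        seen="$seen $pkg"
        for rdep in $(apt-cache --installed rdepends "$pkg" | grep '^ ' | tr -d '|' | sort -u); do
            case " $seen $queue $next " in
                *" $rdep "*) ;;              # already queued or processed
                *) next="$next $rdep" ;;     # newly discovered reverse dependency
            esac
        done
    done
    queue="$next"
done
echo $seen | tr ' ' '\n' | sort -u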

On my desktop, this shows an "interesting" list of targets:

  • apt
  • cadaver - AKA WebDAV
  • curl & wget
  • fwupd - another attack on top of this one
  • git (through the libcurl3-gnutls dependency)
  • mutt - all your emails
  • weechat - your precious private chats

Arguably, fetchers like apt, curl, fwupd, and wget rely on HTTPS for "authentication" more than secrecy, although apt has its own OpenPGP-based authentication so that wouldn't matter anyways. Still, this is truly distressing. And I haven't mentioned here things like gobby, network-manager, systemd, and others - the scope of this is broad. Hell, even good old lynx links against GnuTLS.

In our infrastructure, the magic command looks something like this:

cumin -o txt -p 0  'F:lsbdistcodename=buster' "apt-cache --installed rdepends libgnutls30 | grep '^ ' | sort -u" | tee gnutls-rdepds-per-host | awk '{print $NF}' | sort | uniq -c | sort -n

There, the result is even more worrisome, as those important packages seem to rely on GnuTLS for their transport security:

  • mariadb - all MySQL traffic and passwords
  • mandos - full disk encryption
  • slapd - LDAP passwords

mandos is especially distressing although it's probably not vulnerable because it seems it doesn't store the cleartext -- it's encrypted with the client's OpenPGP public key -- so the TLS tunnel never sees the cleartext either.

Other reports have also mentioned the following servers link against GnuTLS and could be vulnerable:

  • exim
  • rsyslog
  • samba
  • various VNC implementations

Not affected

Those programs are not affected by this vulnerability:

  • apache2
  • gnupg
  • python
  • nginx
  • openssh

This list is not exhaustive, naturally, but serves as an example of common software you don't need to worry about.

The vulnerability only exists in GnuTLS, as far as we know, so programs linking against other libraries are not vulnerable.

Because the vulnerability affects session tickets -- and those are set on the server side of the TLS connection -- only users of GnuTLS as a server are vulnerable. This means, for example, that while weechat uses GnuTLS, it will only suffer from the problem when acting as a server (which it does, in relay mode) or, of course, if the remote IRC server also uses GnuTLS. Same with apt, curl, wget, or git: it is unlikely to be a problem because it is only used as a client; the remote server is usually a webserver -- not git itself -- when using TLS.

Caveats

Keep in mind that it's not because a package links against GnuTLS that it uses it. For example, I have been told that, on Arch Linux, if both GnuTLS and OpenSSL are available, the mutt package will use the latter, so it's not affected. I haven't confirmed that myself nor have I checked on Debian.

Also, because it relies on session tickets, there's a time window after which the ticket gets cycled and properly initialized. But that is apparently 6 hours by default so it is going to protect only really long-lasting TLS sessions, which are uncommon, I would argue.

Update: according to the Tweet's author:

The author of this blog post is misinterpreting the problem. It's not the session ticket which is rotated after 6 hours, but the session ticket encryption key (STEK). This has nothing to do with the length of the TLS session, but rather the lifetime of the process using GnuTLS. For the first 6 hours, connections made to the GnuTLS server are vulnerable. After the process has been running for 6 hours, new connections are safe (assuming there's no other GnuTLS vulnerability). This reduces the impact of the vulnerability considerably (although it's still really bad).

I stand corrected.

My audit is limited. For example, it might have been better to walk the shared library dependencies directly, instead of relying on Debian package dependencies.

Other technical details

It seems the vulnerability might have been introduced in this merge request, itself following an (entirely reasonable) feature request to make it easier to rotate session tickets. The merge request was open for a few months and was thoroughly reviewed by a peer before being merged. Interestingly, the vulnerable function (_gnutls_initialize_session_ticket_key_rotation) explicitly says:

 * This function will not enable session ticket keys on the server side. That is done
 * with the gnutls_session_ticket_enable_server() function. This function just initializes
 * the internal state to support periodical rotation of the session ticket encryption key.

In other words, it thinks it is not responsible for session ticket initialization, yet it is. Indeed, the merge request fixing the problem unconditionally does this:

memcpy(session->key.initial_stek, key->data, key->size);

I haven't reviewed the code and the vulnerability in detail, so take the above with a grain of salt.

The full patch is available here. See also the upstream issue 1011, the upstream advisory, the Debian security tracker, and the Redhat Bugzilla.

Moving forward

The impact of this vulnerability depends on the affected packages and how they are used. It can range from "meh, someone knows I downloaded that Debian package yesterday" to "holy crap my full disk encryption passwords are compromised, I need to re-encrypt all my drives", including "I need to change all LDAP and MySQL passwords".

It promises to be a fun week for some people at least.

Looking ahead, however, one has to wonder whether we should follow @FiloSottile's advice and stop using GnuTLS altogether. There are at least a few programs that link against GnuTLS because of the OpenSSL licensing oddities, but that was first announced in 2015, then definitely and clearly resolved in 2017 -- or maybe that was in 2018? Anyways it's fixed, pinky-promise-I-swear, except if you're one of those weirdos still using GPL-2, of course. Even though OpenSSL isn't the simplest or most secure TLS implementation out there, it could be preferable to GnuTLS and maybe we should consider changing Debian packages to use it in the future.

But then again, the last time something like this happened, it was Heartbleed and GnuTLS wasn't affected, so who knows... It is likely that people don't have OpenSSL in mind when they suggest moving away from GnuTLS and instead think of other TLS libraries like mbedtls (previously known as PolarSSL), NSS, BoringSSL, LibreSSL and so on. Not that those are totally sinless either...

Correction, OpenSSL is actually what those people have in mind:

No, OpenSSL is exactly what we have in mind, including Filippo: https://twitter.com/FiloSottile/status/1270130358634283008

OpenSSL isn't perfect but it has improved considerably since Heartbleed and has the resources (funding and competent people) that a crypto project needs.

"This is fine", as they say...

16 November, 2020 04:03PM

hackergotchi for Adnan Hodzic

Adnan Hodzic

Degiro trading tracker – Simplified tracking of your investments

TL;DR: Visit degiro-trading-tracker on Github. I was always interested in stocks and investing. While I wanted to get into trading for a long time, I could never...

The post Degiro trading tracker – Simplified tracking of your investments appeared first on FoolControl: Phear the penguin.

16 November, 2020 07:21AM by Adnan Hodzic

November 15, 2020

Jamie McClelland

Being your own Certificate Authority

There are many blogs and tutorials with nice shortcuts providing the necessary openssl commands to create and sign x509 certificates.

However, there are precious few instructions for how to easily create your own certificate authority.

You probably never want to do this in a production environment, but in a development environment it will make your life significantly easier.

Create the certificate authority

Create the key and certificate

Pick a directory to store things in. Then, make your certificate authority key and certificate:

openssl genrsa -out cakey.pem 2048
openssl req -x509 -new -nodes -key cakey.pem -sha256 -days 1024 -out cacert.pem

Some tips:

  • You will be prompted to enter some information about your certificate authority. Provide the minimum information - i.e., only overwrite the defaults. So, provide a value for Country, State or Province, and Organization Name and leave the rest blank.
  • You probably want to leave the password blank if this is a development/testing environment.

Want to review what you created?

openssl x509 -text -noout -in cacert.pem 

Prepare your directory

You can create your own /etc/ssl/openssl.cnf file and really customize things. But I find it safer to use your distribution's default file so you can benefit from changes to it every time you upgrade.

If you do take the default file, you may have the dir option coded to demoCA (in Debian at least, maybe it's the upstream default too).

So, to avoid changing any configuration files, let's just use this value. Which means... you'll need to create that directory. The setting is relative - so you can create this directory in the same directory you have your keys.

mkdir demoCA

Lastly, you have to have a file that keeps track of your certificates. If it doesn't exist, you get an error:

touch demoCA/index.txt

That's it! Your certificate authority is ready to go.

Create a key and certificate signing request

First, pick your domain names (aka "common" names). For example, example.org and www.example.org.

Set those values in an environment variable. If you just have one:

export subjectAltName=DNS:example.org

If you have more than one:

export subjectAltName=DNS:example.org,DNS:www.example.org

Next, create a key and a certificate signing request:

openssl req -new -nodes -out new.csr -keyout new.key

Again, you will be prompted for some values (country, state, etc) - be sure to choose the same values you used with your certificate authority! I honestly don't understand why this is necessary (when I set different values I get an error on the signing request step below). Maybe someone can add a comment to this post explaining why these values have to match?

Also, you must provide a common name for your certificate - you can choose the same name as the subjectAltName value you set above (but just one domain).
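
If your distribution's openssl.cnf turns out not to consume that environment variable, one possible alternative sketch (assuming OpenSSL 1.1.1 or newer) is to embed the SAN directly in the request; note that openssl ca only copies request extensions into the signed certificate if copy_extensions is enabled in its configuration:

# Put the SAN straight into the CSR instead of relying on the environment.
openssl req -new -nodes -out new.csr -keyout new.key \
  -addext "subjectAltName=$subjectAltName"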

Want to review what you created?

openssl req -in new.csr -text -noout 

Sign it!

At last, the moment we have been waiting for.

openssl ca -keyfile cakey.pem -cert cacert.pem -out new.crt -outdir . -rand_serial -infiles new.csr

Now you have a new.crt (and its matching new.key) that you can install in your web server, mail server, etc., as its configuration specifies.

Sanity check it

This command will confirm that the certificate is trusted by your certificate authority.

openssl verify -no-CApath -CAfile cacert.pem new.crt 

But wait, there's still a question of trust

You probably want to tell your computer or browser that you want to trust your certificate signing authority.

Command line tools

Most tools in linux by default will trust all the certificates in /etc/ssl/certs/ca-certificates.crt. (If that file doesn't exist, try installing the ca-certificates package). If you want to add your certificate to that file:

sudo cp cacert.pem /usr/local/share/ca-certificates/cacert.crt
sudo dpkg-reconfigure ca-certificates

Want to know what's funny? Ok, not really funny. If the certificate name ends with .pem the command above won't work. Seriously.

Once your certificate is installed with your web server you can now test to make sure it's all working with:

gnutls-cli --print-cert $domainName

Want a second opinion?

curl https://$domainName
wget https://$domainName -O-

Both will report errors if the certificate can't be verified by a system certificate.

If you really want to narrow down the cause of the error (maybe reconfiguring ca-certificates didn't work?), you can point curl directly at your certificate authority file:

curl --cacert /path/to/your/cacert.pem --capath /tmp https://$domainName

Those arguments tell curl to use your certificate authority file and not to load any other certificate authority files (well, unless you have some installed in the temp directory).

Web browsers

Firefox and Chrome have their own store of trusted certificates - you'll have to import your cacert.pem file into each browser that you want to trust your key.
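
You can also do this from the command line with certutil from the libnss3-tools package; the Firefox profile directory below is a placeholder you will need to adjust, and Chromium uses the shared NSS database instead:

# Firefox: import the CA as trusted for TLS server certificates
certutil -A -n "my dev CA" -t "C,," -i cacert.pem \
  -d sql:$HOME/.mozilla/firefox/abcd1234.default
# Chromium/Chrome on Linux uses the shared NSS database
certutil -A -n "my dev CA" -t "C,," -i cacert.pem -d sql:$HOME/.pki/nssdb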

15 November, 2020 06:49PM

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

Using Buypass card readers in Linux

If you want to know the result of your corona test in Norway (or really, any other health information), you'll need to either get an electronic ID with a confidential spec where your bank holds the secret key, can use it towards other banks with no oversight, and allows whoever has that key to take up huge loans and other binding agreements in your name in a matter of minutes (also known as “BankID”)… or you can get a smart card from Buypass, where you hold the private key yourself.

Since most browsers won't talk directly to a smart card, you'll need a small proxy that exposes a REST interface on 127.0.0.1. (It used to be solved with a Java applet, but, yeah. That was 40 Chrome releases ago.) Buypass publishes those only for Windows and Mac, but the protocol was simple enough, so I made my own reimplementation called Linux Dallas Multipass. It's rough: it only really seems to work in Firefox (and only if you spoof your UA to be Windows), and you'll need to generate and install certificates to install it yourself… but yes. You can log in to find out that you're negative.

15 November, 2020 11:15AM

Vincent Bernat

Zero-Touch Provisioning for Juniper

Juniper’s official documentation on ZTP explains how to configure the ISC DHCP Server to automatically upgrade and configure a Juniper device on first boot. However, the proposed configuration could be a bit more elegant. This note explains how.

TL;DR

Do not redefine option 43. Instead, specify the vendor option space to use to encode parameters with vendor-option-space.


When booting for the first time, a Juniper device requests its IP address through a DHCP discover message, then requests additional parameters for autoconfiguration through a DHCP request message:

Dynamic Host Configuration Protocol (Request)
    Message type: Boot Request (1)
    Hardware type: Ethernet (0x01)
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0x44e3a7c9
    Seconds elapsed: 0
    Bootp flags: 0x8000, Broadcast flag (Broadcast)
    Client IP address: 0.0.0.0
    Your (client) IP address: 0.0.0.0
    Next server IP address: 0.0.0.0
    Relay agent IP address: 0.0.0.0
    Client MAC address: 02:00:00:00:00:01 (02:00:00:00:00:01)
    Client hardware address padding: 00000000000000000000
    Server host name not given
    Boot file name not given
    Magic cookie: DHCP
    Option: (54) DHCP Server Identifier (10.0.2.2)
    Option: (55) Parameter Request List
        Length: 14
        Parameter Request List Item: (3) Router
        Parameter Request List Item: (51) IP Address Lease Time
        Parameter Request List Item: (1) Subnet Mask
        Parameter Request List Item: (15) Domain Name
        Parameter Request List Item: (6) Domain Name Server
        Parameter Request List Item: (66) TFTP Server Name
        Parameter Request List Item: (67) Bootfile name
        Parameter Request List Item: (120) SIP Servers
        Parameter Request List Item: (44) NetBIOS over TCP/IP Name Server
        Parameter Request List Item: (43) Vendor-Specific Information
        Parameter Request List Item: (150) TFTP Server Address
        Parameter Request List Item: (12) Host Name
        Parameter Request List Item: (7) Log Server
        Parameter Request List Item: (42) Network Time Protocol Servers
    Option: (50) Requested IP Address (10.0.2.15)
    Option: (53) DHCP Message Type (Request)
    Option: (60) Vendor class identifier
        Length: 15
        Vendor class identifier: Juniper-mx10003
    Option: (51) IP Address Lease Time
    Option: (12) Host Name
    Option: (255) End
    Padding: 00

It requests several options, including the TFTP server address option 150, and the Vendor-Specific Information Option 43—or VSIO. The DHCP server can use option 60 to identify the vendor-specific information to send. For Juniper devices, option 43 encodes the image name and the configuration file name. They are fetched from the IP address provided in option 150.

The official documentation on ZTP provides a valid configuration to answer such a request. However, it does not leverage the ability of the ISC DHCP Server to support several vendors and redefines option 43 to be Juniper-specific:

option NEW_OP-encapsulation code 43 = encapsulate NEW_OP;

Instead, it is possible to define an option space for Juniper, using a self-descriptive name, without overriding option 43:

# Juniper vendor option space
option space juniper;
option juniper.image-file-name     code 0 = text;
option juniper.config-file-name    code 1 = text;
option juniper.image-file-type     code 2 = text;
option juniper.transfer-mode       code 3 = text;
option juniper.alt-image-file-name code 4 = text;
option juniper.http-port           code 5 = text;

Then, when you need to set these suboptions, specify the vendor option space:

class "juniper-mx10003" {
  match if option vendor-class-identifier = "Juniper-mx10003";
  vendor-option-space juniper;
  option juniper.transfer-mode    "http";
  option juniper.image-file-name  "/images/junos-vmhost-install-mx-x86-64-19.3R2-S4.5.tgz";
  option juniper.config-file-name "/cfg/juniper-mx10003.txt";
}

This configuration returns the following answer:1

Dynamic Host Configuration Protocol (ACK)
    Message type: Boot Reply (2)
    Hardware type: Ethernet (0x01)
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0x44e3a7c9
    Seconds elapsed: 0
    Bootp flags: 0x8000, Broadcast flag (Broadcast)
    Client IP address: 0.0.0.0
    Your (client) IP address: 10.0.2.15
    Next server IP address: 0.0.0.0
    Relay agent IP address: 0.0.0.0
    Client MAC address: 02:00:00:00:00:01 (02:00:00:00:00:01)
    Client hardware address padding: 00000000000000000000
    Server host name not given
    Boot file name not given
    Magic cookie: DHCP
    Option: (53) DHCP Message Type (ACK)
    Option: (54) DHCP Server Identifier (10.0.2.2)
    Option: (51) IP Address Lease Time
    Option: (1) Subnet Mask (255.255.255.0)
    Option: (3) Router
    Option: (6) Domain Name Server
    Option: (43) Vendor-Specific Information
        Length: 89
        Value: 00362f696d616765732f6a756e6f732d766d686f73742d69…
    Option: (150) TFTP Server Address
    Option: (255) End

Using the vendor-option-space directive allows different ZTP implementations to coexist. For example, you can add the option space for PXE:

option space PXE;
option PXE.mtftp-ip    code 1 = ip-address;
option PXE.mtftp-cport code 2 = unsigned integer 16;
option PXE.mtftp-sport code 3 = unsigned integer 16;
option PXE.mtftp-tmout code 4 = unsigned integer 8;
option PXE.mtftp-delay code 5 = unsigned integer 8;
option PXE.discovery-control    code 6  = unsigned integer 8;
option PXE.discovery-mcast-addr code 7  = ip-address;
option PXE.boot-server code 8  = { unsigned integer 16, unsigned integer 8, ip-address };
option PXE.boot-menu   code 9  = { unsigned integer 16, unsigned integer 8, text };
option PXE.menu-prompt code 10 = { unsigned integer 8, text };
option PXE.boot-item   code 71 = unsigned integer 32;

class "pxeclients" {
  match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
  vendor-option-space PXE;
  option PXE.mtftp-ip 10.0.2.2;
  # […]
}
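
Whatever option spaces you end up defining, it can't hurt to check that the ISC DHCP server still accepts the configuration before reloading it; a quick check might look like this:

# syntax-check the configuration without starting the daemon
# (the path is the Debian default; adjust to your setup)
dhcpd -t -cf /etc/dhcp/dhcpd.conf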

On the same topic, do not override option 125 “VIVSO.” See “Zero-Touch Provisioning for Cisco IOS.”


  1. Wireshark knows how to decode option 43 for some vendors, thanks to option 60, but not for Juniper. ↩︎

15 November, 2020 10:20AM by Vincent Bernat

Russ Allbery

PGP::Sign 1.04

The refactor of PGP::Sign in the 1.00 release to use IPC::Run instead of hand-rolled process management code broke signing large files, which I discovered when trying to use the new module to sign checkgroups for the Big Eight Usenet hierarchies.

There were two problems: IPC::Run sets the sockets used to talk to the child process to non-blocking, and when you pass in a scalar as the data to send to a child socket, IPC::Run treats it as a queue and thus doesn't send EOF to the child process when the input is exhausted.

This release works around both problems by handling non-blocking writes to the child using select and using a socket to write the passphrase to the child process instead of a scalar variable. It also adds a test to ensure that signing long input keeps working.

You can get the latest release from CPAN or from the PGP::Sign distribution page.

15 November, 2020 12:05AM

November 14, 2020

hackergotchi for Junichi Uekawa

Junichi Uekawa

Rewrote my build system in C++.

Rewrote my build system in C++. I used to write build rules in Nodejs, but I figured if my projects are mostly C++ I should probably write them in C++. I wanted to make it a bit more like BUILD files but couldn't really, and it ended up looking more like C++ than I wanted. It seems key-value struct initialization (designated initializers) isn't available until C++20.

14 November, 2020 08:43AM by Junichi Uekawa

November 13, 2020

hackergotchi for Martin Michlmayr

Martin Michlmayr

beancount2ledger 1.3 released

I released version 1.3 of beancount2ledger, the beancount to ledger converter that was moved from bean-report ledger into a standalone tool.

You can get beancount2ledger from GitHub or via pip install.
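
A rough usage sketch (the file name is an example; the exact options are best checked with beancount2ledger --help):

pip install beancount2ledger
# convert a beancount file to ledger syntax on standard output
beancount2ledger example.beancount > example.ledger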

Here are the changes in 1.3:

  • Add rounding postings only when required (issue #9)
  • Avoid printing too much precision for a currency (issue #21)
  • Avoid creating two or more postings with null amount (issue #23)
  • Add price to cost when needed by ledger (issue #22)
  • Preserve posting order (issue #18)
  • Add config option indent
  • Show metadata with hledger output
  • Support setting auxiliary dates and posting dates from metadata (issue #14)
  • Support setting the code of transactions from metadata
  • Support mapping of account and currency names (issue #24)
  • Improve documentation:
    • Add user guide
    • Document limitations (issue #12)

13 November, 2020 12:12PM by Martin Michlmayr

November 12, 2020

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

tidyCpp 0.0.2: More documentation and features

A first update of the still fairly new package tidyCpp is now on CRAN. The package offers a C++ layer on top of the C API for R, which aims to make its use a little easier and more consistent.

The vignette has been extended with a new example, a new section, and some general editing. A few new defines have been added, mostly from the Rinternals.h header. We also replaced the Shield class with a simpler yet updated class, Protect. The name better represents the core functionality of offering a simpler alternative to the PROTECT and UNPROTECT macro pairing. We also added a short discussion to the vignette of a gotcha one has to be mindful of, and which we fell for ourselves in version 0.0.1. We also added a typedef so that code using Shield still works.

The NEWS entry follows.

Changes in tidyCpp version 0.0.2 (2020-11-12)

  • Expanded definitions in internals.h to support new example.

  • The vignette has been extended with an example based on package uchardet.

  • Class Shield has been replaced by a new class Protect; a compatibility typedef has been added.

  • The examples and vignette have been clarified with respect to proper ownership of protected objects; a new vignette section was added.

Thanks to my CRANberries, there is also a diffstat report for this release.

For questions, suggestions, or issues please use the issue tracker at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

12 November, 2020 03:03PM

hackergotchi for Bits from Debian

Bits from Debian

"Homeworld" will be the default theme for Debian 11

The theme "Homeworld" by Juliette Taka has been selected as default theme for Debian 11 'bullseye'. Juliette says that this theme has been inspired by the Bauhaus movement, an art style born in Germany in the 20th century.

Homeworld wallpaper. Click to see the whole theme proposal

Homeworld debian-installer theme. Click to see the whole theme proposal

After the Debian Desktop Team made the call for proposing themes, a total of eighteen choices were submitted. The desktop artwork poll was open to the public, and we received 5,613 responses ranking the different choices, with Homeworld ranked as the winner among them.

This is the third time that a submission by Juliette has won. Juliette is also the author of the lines theme that was used in Debian 8 and the softWaves theme that was used in Debian 9.

We'd like to thank all the designers that have participated and have submitted their excellent work in the form of wallpapers and artwork for Debian 11.

Congratulations, Juliette, and thank you very much for your contribution to Debian!

12 November, 2020 12:30PM by Jonathan Carter

Mike Hommey

Announcing git-cinnabar 0.5.6

Please partake in the git-cinnabar survey.

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.
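
For the unfamiliar, a minimal usage sketch (the repository URL is just an example):

# clone a mercurial repository through git-cinnabar
git clone hg::https://www.mercurial-scm.org/repo/hg
# fetch a prebuilt helper, as mentioned in the release notes below
git cinnabar download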

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.5?

  • Updated git to 2.29.2 for the helper.
  • git cinnabar git2hg and git cinnabar hg2git now have a --batch flag.
  • Fixed a few issues with experimental support for python 3.
  • Fixed compatibility issues with mercurial >= 5.5.
  • Avoid downloading unsupported clonebundles.
  • Provide more resilience to network problems during bundle download.
  • Prebuilt helper for Apple Silicon macos now available via git cinnabar download.

12 November, 2020 02:40AM by glandium

November 11, 2020

Vincent Fourmond

Solution for QSoas quiz #1: averaging spectra

This post describes the solution to Quiz #1, based on the files found there. The point is to produce both the average and the standard deviation of a series of spectra. Below is how the final averaged spectrum should look:
I will present here two different solutions.

Solution 1: using the definition of standard deviation

There is a simple solution using the definition of the standard deviation: $$\sigma_y = \sqrt{\langle y^2\rangle - \langle y\rangle^2}$$ in which \(\langle y^2\rangle\) is the average of \(y^2\) (and so on). So the simplest solution is to construct datasets with an additional column that would contain \(y^2\), average these columns, and replace the average with the above formula. For that, we need first a companion script that loads a single data file and adds a column with \(y^2\). Let's call this script load-one.cmds:
load ${1}
apply-formula y2=y**2 /extra-columns=1
flag /flags=processed
When this script is run with the name of a spectrum file as argument, it loads it (${1} is replaced by the first argument, the file name), adds a column y2 containing the square of the y column, and flags it with the processed flag. This is not absolutely necessary, but it makes it much easier to refer to all the spectra once they are processed. Then, to process all the spectra, one just has to run the following commands:
run-for-each load-one.cmds Spectrum-1.dat Spectrum-2.dat Spectrum-3.dat
average flagged:processed
apply-formula y2=(y2-y**2)**0.5
dataset-options /yerrors=y2
The run-for-each command runs the load-one.cmds script for all the spectra (one could also have used Spectrum-*.dat to not have to give all the file names). Then, the average command averages the values of the columns over all the datasets. To be clear, it finds all the values that have the same X (or very close X values) and averages them, column by column. The result of this command is therefore a dataset with the average of the original \(y\) data as y column and the average of the original \(y^2\) data as y2 column. So now, the only thing left to do is to use the above equation, which is done by the apply-formula code. The last command, dataset-options, is not absolutely necessary but it signals to QSoas that the standard error of the y column should be found in the y2 column. This is now available as script method-one.cmds in the git repository.

Solution 2: use QSoas's knowledge of standard deviation

The other method is a little more involved but it demonstrates a good approach to problem solving with QSoas. The starting point is that, in apply-formula, the value $stats.y_stddev corresponds to the standard deviation of the whole y column... Loading the spectra yields just a series of x,y datasets. We can contract them into a single dataset with one x column and several y columns:
load Spectrum-*.dat /flags=spectra
contract flagged:spectra
After these commands, the current dataset contains data in the form of:
lambda1	a1_1	a1_2	a1_3
lambda2	a2_1	a2_2	a2_3
...
in which the ai_1 come from the first file, ai_2 the second and so on. We need to use transpose to transform that dataset into:
0	a1_1	a2_1	...
1	a1_2	a2_2	...
2	a1_3	a2_3	...
In this dataset, the values of the absorbance at the same wavelength, one for each dataset, are now stored in columns. The next step is just to use expand to obtain a series of datasets with the same x column and a single y column (each corresponding to a different wavelength in the original data). The game is now to replace these datasets with something that looks like:
0	a_average
1	a_stddev
For that, one takes advantage of the $stats.y_average and $stats.y_stddev values in apply-formula, together with the i special variable that represents the index of the point:
apply-formula "if i == 0; then y=$stats.y_average; end; if i == 1; then y=$stats.y_stddev; end"
strip-if i>1
Then, all that is left is to apply this to all the datasets created by expand, which can be done using run-for-datasets, and then we reverse the splitting by using contract and transpose! In summary, it looks like this. We need two files. The first, process-one.cmds, contains the following code:
apply-formula "if i == 0; then y=$stats.y_average; end; if i == 1; then y=$stats.y_stddev; end"
strip-if i>1
flag /flags=processed
The main file, method-two.cmds looks like this:
load Spectrum-*.dat /flags=spectra
contract flagged:spectra
transpose
expand /flags=tmp
run-for-datasets process-one.cmds flagged:tmp
contract flagged:processed
transpose
dataset-options /yerrors=y2
Note some of the code above can be greatly simplified using new features present in the upcoming 3.0 version, but that is the topic for another post.

About QSoas

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code and compile it yourself or buy precompiled versions for MacOS and Windows there.

11 November, 2020 07:53PM by Vincent Fourmond (noreply@blogger.com)

Reproducible Builds

Reproducible Builds in October 2020

Welcome to the October 2020 report from the Reproducible Builds project.

In our monthly reports, we outline the major things that we have been up to over the past month. As a brief reminder, the motivation behind the Reproducible Builds effort is to ensure flaws have not been introduced in the binaries we install on our systems. If you are interested in contributing to the project, please visit our main website.

General

On Saturday 10th October, Morten Linderud gave a talk at Arch Conf Online 2020 on The State of Reproducible Builds in Arch. The video should be available later this month, but as a teaser:

The previous year has seen great progress in Arch Linux to get reproducible builds in the hands of the users and developers. In this talk we will explore the current tooling that allows users to reproduce packages, the rebuilder software that has been written to check packages and the current issues in this space.

During the Reproducible Builds summit in Marrakesh in 2019, developers from the GNU Guix, NixOS and Debian distributions were able to produce a bit-for-bit identical GNU Mes binary despite using three different versions of GCC. Since this summit, additional work resulted in a bit-for-bit identical Mes binary using tcc, and last month a fuller update was posted to this effect by the individuals involved. This month, however, David Wheeler updated his extensive page on Fully Countering Trusting Trust through Diverse Double-Compiling, remarking that:

GNU Mes rebuild is definitely an application of [Diverse Double-Compiling]. [..] This is an awesome application of DDC, and I believe it’s the first publicly acknowledged use of DDC on a binary

There was a small, followup discussion on our mailing list.

In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

This month, the Reproducible Builds project restarted our IRC meetings, managing to convene twice: the first time on October 12th (summary & logs), and later on the 26th (logs). As mentioned in previous reports, due to the unprecedented events throughout 2020, there will be no in-person summit event this year.

On our mailing list this month Elías Alejandro posted a request for help with a local configuration

In August, Lucas Nussbaum performed an archive-wide rebuild of packages to test enabling the reproducible=+fixfilepath Debian build flag by default. Enabling this fixfilepath feature will likely fix reproducibility issues in an estimated 500-700 packages. However, this month Vagrant Cascadian posted to the debian-devel mailing list:

It would be great to see the reproducible=+fixfilepath feature enabled by default in dpkg-buildflags, and we would like to proceed forward with this soon unless we hear any major concerns or other outstanding issues. […] We would like to move forward with this change soon, so please raise any concerns or issues not covered already.
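
For maintainers who want to see the effect on their own package before the default changes, a rough sketch of enabling the feature area for a single build (to be run from an unpacked source package; the spelling follows the dpkg-buildflags feature-area syntax):

# enable the fixfilepath feature of the reproducible area for this build only
DEB_BUILD_OPTIONS="reproducible=+fixfilepath" dpkg-buildpackage -us -uc
# inspect the resulting compiler flags
DEB_BUILD_OPTIONS="reproducible=+fixfilepath" dpkg-buildflags --get CFLAGS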

Debian Developer Stuart Prescott has been improving python-debian, a Python library that is used to parse Debian-specific files such as changelogs, .dscs, etc. In particular, Stuart is working on adding support for .buildinfo files used for recording reproducibility-related build metadata:

This can mostly be a very thin layer around the existing Deb822 types, using the existing Changes code for the file listings, the existing PkgRelations code for the package listing and gpg_* functions for signature handling.

A total of 159 Debian packages were categorised, 69 had their categorisation updated, and 33 had their classification removed this month, adding to our knowledge about identified issues. As part of this, Chris Lamb identified and classified two new issues: build_path_captured_in_emacs_el_file and rollup_embeds_build_path.

Software development

This month, we tried to fix a large number of currently-unreproducible packages, including:

Bernhard M. Wiedemann also reported three issues against bison, ibus and postgresql12.

Tools

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it provides human-readable diffs of all kinds of files too (a basic invocation is sketched after the list below). This month, Chris Lamb uploaded version 161 to Debian (later backported by Mattia Rizzolo), as well as making the following changes:

  • Move test_ocaml to the assert_diff helper. []
  • Update tests to support OCaml version 4.11.1. Thanks to Sebastian Ramacher for the report. (#972518)
  • Bump minimum version of the Black source code formatter to 20.8b1. (#972518)
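
For readers who have never run the tool, a basic invocation looks something like this (the file names are placeholders):

# compare two builds of the same package and write an HTML report
diffoscope --html report.html package_1.0_amd64.deb package_1.0_amd64.rebuild.deb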

In addition, Jean-Romain Garnier temporarily updated the dependency on radare2 to ensure our test pipelines continue to work [], and for the GNU Guix distribution Vagrant Cascadian updated diffoscope to version 161 [].

In related development, trydiffoscope is the web-based version of diffoscope. This month, Chris Lamb made the following changes:

  • Mark a --help-only test as being a ‘superficial’ test. (#971506)
  • Add a real, albeit flaky, test that interacts with the try.diffoscope.org service. []
  • Bump debhelper compatibility level to 13 [] and bump Standards-Version to 4.5.0 [].

Lastly, disorderfs version 0.5.10-2 was uploaded to Debian unstable by Holger Levsen, which enabled security hardening via DEB_BUILD_MAINT_OPTIONS [] and dropped debian/disorderfs.lintian-overrides [].

Website and documentation

This month, a number of updates to the main Reproducible Builds website and related documentation were made by Chris Lamb:

  • Add a citation link to the academic article regarding dettrace [], and added yet another supply-chain security attack publication [].
  • Reformatted the Jekyll’s Liquid templating language and CSS formatting to be consistent [] as well as expand a number of tab characters [].
  • Used relative_url to fix missing translation icon on various pages. []
  • Published two announcement blog posts regarding the restarting of our IRC meetings. [][]
  • Added an explicit note regarding the lack of an in-person summit in 2020 to our events page. []

Testing framework

The Reproducible Builds project operates a Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, Holger Levsen made the following changes:

  • Debian-related changes:

    • Refactor and improve the Debian dashboard. [][][]
    • Track bugs which are usertagged as ‘filesystem’, ‘fixfilepath’, etc.. [][][]
    • Make a number of changes to package index pages. [][][]
  • System health checks:

    • Relax disk space warning levels. []
    • Specifically detect build failures reported by dpkg-buildpackage. []
    • Fix a regular expression to detect outdated package sets. []
    • Detect Lintian issues in diffoscope. []

  • Misc:

    • Make a number of updates to reflect that our sponsor Profitbricks has renamed itself to IONOS. [][][][]
    • Run a F-Droid maintenance routine twice a month to utilise its cleanup features. []
    • Fix the target name in OpenWrt builds to ath79 from ath97. []
    • Add a missing Postfix configuration for a node. []
    • Temporarily disable Arch Linux builds until a core node is back. []
    • Make a number of changes to our “thanks” page. [][][]

Build node maintenance was performed by both Holger Levsen [][] and Vagrant Cascadian [][][]. Vagrant Cascadian also updated the page listing the variations made when testing to reflect changes in build paths [], and Hans-Christoph Steiner made a number of changes for F-Droid, the free software app repository for Android devices, including:

  • Do not fail reproducibility jobs when their cleanup tasks fail. []
  • Skip libvirt-related sudo command if we are not actually running libvirt. []
  • Use direct URLs in order to eliminate a useless HTTP redirect. []

If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. However, you can also get in touch with us via:

11 November, 2020 02:35PM

November 10, 2020

hackergotchi for Daniel Pocock

Daniel Pocock

Withdrawing my nomination for Fedora Council

The deadline for Fedora Council nominations is tomorrow, 11 November. I've decided to withdraw my nomination and I'd like to do so publicly so that anybody else who wants to contest the position has an opportunity to hastily submit a nomination in the hours that remain before the deadline.

Even without going as far as the voting stage, I believe my brief campaign has already achieved some remarkable things.

First of all, my email nominating myself on the Fedora Council list was delayed for approximately 14 hours.

Received: from mailman01.iad2.fedoraproject.org (localhost [IPv6:::1])
 by mailman01.iad2.fedoraproject.org (Postfix) with ESMTP id 7505B7605D96F;
 Thu, 29 Oct 2020 12:54:59 +0000 (UTC)
Received: by mailman01.iad2.fedoraproject.org (Postfix, from userid 991)
 id 4F9CF7607184F; Wed, 28 Oct 2020 22:52:32 +0000 (UTC)

This strikes me as a strong hint that I'm an independent candidate who is not afraid to ask the important questions. If I was to suggest the election was about to be rigged then I would risk looking like the world's biggest clown but nonetheless, messages delayed like that make it feel like the playing field is not quite level. It is an all-too-familiar feeling in the free software world. This was an important revelation about transparency achieved without even getting to a vote.

Secondly, there has been increased attention on the difference between real harassment as opposed to the counter-accusations that appear all too frequently from leaders covering up their own rogue behavior. Harvey Weinstein's lawyer publicly blamed his victims and the outgoing US president claimed to be a victim of lynching. Leaders of certain free software organizations cry the same crocodile tears. Let us remember the original words chosen by young women coming into our environment are the only genuine examples of harassment. With or without a vote, I remain committed to allocating a share of my time to courageous volunteers like this. Their willingness to speak up gives me hope that the leaders of tomorrow may be better than the office holders of yesterday.

The ultimate achievement of my short campaign is to depart gracefully. I couldn't think of a better time to do so. This is what leadership looks like. Either you understand that or you don't.

I didn't nominate simply to resign. While the censorship of my nomination email was unpleasant, I'm about to take on another responsibility and ensure I prioritize correctly and ensure the right balance exists between my free and open source software activities. I continue to dedicate time to the OpenPOWER / Talos II platform, the Domoticz, Zigate and Zigbee integration and my bread-and-butter, free real-time communications, which is more important than ever right now.

I wish all the candidates good luck.

2020 Melbourne Cup

10 November, 2020 08:50PM

Thorsten Alteholz

My Debian Activities in October 2020

FTP master

This month I accepted 208 packages and rejected 29. The overall number of packages that got accepted was 563, so yeah, I was not alone this month :-).

Anyway, this month marked another milestone in my NEW package handling. My overall number of ACCEPTed packages exceeded the magic number of 20000 packages. This is almost 30% of all packages accepted in Debian. I am a bit proud of this achievement.

Debian LTS

This was my seventy-sixth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my all in all workload has been 20.75h. During that time I did LTS uploads of:

  • [DLA 2415-1] freetype security update for one CVE
  • [DLA 2419-1] dompurify.js security update for two CVEs
  • [DLA 2418-1] libsndfile security update for eight CVEs
  • [DLA 2421-1] cimg security update for eight CVEs

I also started to work on golang-1.7 and golang-1.8

Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the twenty-eighth ELTS month.

During my allocated time I uploaded:

  • ELA-289-2 for python3.4
  • ELA-304-1 for freetype
  • ELA-305-1 for libsndfile

The first upload of python3.4, last month, did not build on armel, so I had to reupload an improved package this month. For amd64 and i386 the ELTS packages are built in native mode, whereas the packages on armel are cross-built. There is some magic in debian/rules of python to detect in which mode the package is built. This is important as some tests of the testsuite are not really working in cross-build-mode. Unfortunately I had to learn this the hard way …

The upload of libsndfile now aligns the number of fixed CVEs in all releases.

Last but not least I did some days of frontdesk duties.

Other stuff

Apart from my NEW-handling and LTS/ELTS stuff, I didn't have much fun with Debian packages this month. Given the approaching freeze, I hope this will change again in November.

10 November, 2020 02:48PM by alteholz

hackergotchi for Jonathan Dowland

Jonathan Dowland

Borg, confidence in backups, GtkPod and software preservation

Over the summer I decided to migrate my backups from rdiff-backup to borg, which offers some significant advantages, in particular de-duplication, but comes at a cost of complexity, and a corresponding sense of unease about how sound my backup strategy might be. I've now hit the Point Of No Return: my second external backup drive is overdue being synced with my NAS, which will delete the last copy of the older rdiff-backup backups.

Whilst I hesitate over this last action to commit to borg, something else happened. My wife wanted to put a copy of her iTunes music library on her new phone, and I couldn't find it: not only could I not find it on any of our computers, I also couldn't find a copy on the NAS, or in backups, or even in old DVD-Rs. This has further knocked my confidence in our family data management, and makes me even more nervous to commit to borg. I'm now wondering about stashing the contents of the second external backup disk on some cloud service as a fail-safe.

There was one known-good copy of Sarah's music: on her ancient iPod Nano. Apple have gone to varying lengths to prevent you from copying music from an iPod. When Music is copied to an iPod, the files are stripped of all their metadata (artist, title, album, etc.) and renamed to something non-identifying (e.g. F01/MNRL.m4a), and the metadata (and correlation to the obscure file name) is saved in separate database files. The partition of the flash drive containing all this is also marked as "hidden" to prevent it appearing on macOS and Windows systems. We are lucky that the iPod is so old, because Apple went even further in more recent models, adding a layer of encryption.

To get the music off the iPod, one has to undo all of these steps.

Luckily, other fine folks have worked out reversing all these steps and implemented it in software such as libgpod and its frontend, GtkPod, which is still currently available as a Debian package. It mostly worked, and I got back 95% of the tracks. (It would have been nice if GtkPod had reported the tracks it hadn't recovered; it was aware they existed, based on the errors it did print. But you can't have everything.)

GtkPod is a quirky, erratic piece of software, that is only useful for old Apple equipment that is long out of production, prior to the introduction of the encryption. The upstream homepage is dead, and I suspect it is unmaintained. The Debian package is orphaned. It's been removed from testing, because it won't build with GCC 10. On the other hand, my experience shows that it worked, and was useful for a real problem that someone had today.

I'm in two minds about GtkPod's fate. On the one hand, I think Debian has far too many packages, with a corresponding burden of maintenance responsibility (for the whole project, not just the individual package maintainers), and there's a quality problem: once upon a time, if software had been packaged in a distribution like Debian, that was a mark of quality, a vote of confidence, and you could have some hope that the software would work and integrate well with the rest of the system. That is no longer true, and hasn't been in my experience for many years. If we were more discerning about what software we included in the distribution, and what we kept, perhaps we could be a leaner distribution, faster to adapt to the changing needs in the world, and of a higher quality.

On the other hand, this story about GtkPod is just one of many similar stories. Real problems have been solved in open source software, and computing historians, vintage computer enthusiasts, researchers etc. can still benefit from that software long into the future. Throwing out all this stuff in the name of "progress", could be misguided. I'm especially sad when I see the glee which people have expressed when ditching libraries like Qt4 from the archive. Some software will not be ported on to Qt5 (or Gtk3, Qt6, Gtk4, Qt7, etc., in perpetuity). Such software might be all of: unmaintained, "finished", and useful for some purpose (however niche), all at the same time.

10 November, 2020 11:01AM

Red Hat at the Turing Institute

In Summer 2019 Red Hat were invited to the Turing Institute to provide a workshop on issues around building and sustaining an Open Source community. I was part of a group of about 6 people to visit the Turing and deliver the workshop. It seemed to have been well received by the audience.

The Turing Institute is based within the British Library. For many years I have enjoyed visiting the British Library if I was visiting or passing through London for some reason or other: it's such a lovely serene space in a busy, hectic part of London. On one occasion they had Jack Kerouac's manuscript for "On The Road" on display in one of the public gallery spaces: it's a continuous 120-foot long piece of paper that Kerouac assembled to prevent the interruption of changing sheets of paper in his typewriter from disturbing his flow whilst writing.

The Institute itself is a really pleasant-looking working environment. I got a quick tour of it back in February 2019 when visiting a friend who worked there, but last year's visit was my first prolonged experience of working there. (I also snuck in this February, when passing through London, to visit my supervisor who is a Turing Fellow)

I presented a section of a presentation entitled "How to build a successful Open Source community". My section attempted to focus on the "how". We've put out all the presentations under a Creative Commons license, and we've published them on the Red Hat Research website: https://research.redhat.com/blog/2020/08/12/open-source-at-the-turing/

The workshop participants were drawn from PhD students, research associates, research software engineers and Turing Institute fellows. We had some really great feedback from them which we've fed back into revisions of the workshop material including the presentations.

I'm hoping to stay involved in further collaborations between the Turing and Red Hat. I'm pleased to say that we participated in a recent Tools, practices and systems seminar (although I was not involved).

10 November, 2020 09:54AM

hackergotchi for Sean Whitton

Sean Whitton

Combining repeat and repeat-complex-command

In Emacs, you can use C-x z to repeat the last command you input, and subsequently you can keep tapping the ‘z’ key to execute that command again and again. If the command took minibuffer input, however, you’ll be asked for that input again. For example, suppose you type M-z : to delete through the next colon character. If you want to keep going and delete through the next few colons, you would need to use C-x z : z : z : etc. which is pretty inconvenient. So there’s also C-x ESC ESC RET or C-x M-: RET, which will repeat the last command which took minibuffer input, as if you’d given it the same minibuffer input. So you could use M-z : C-x M-: RET C-x M-: RET etc., but then you might as well just keep typing M-z : over and over. It’s also quite inconvenient to have to remember whether you need to use C-x z or C-x M-: RET.

I wanted to come up with a single command which would choose the correct repetition method. It turns out it’s a bit involved, but here’s what I came up with. You can use this under the GPL-3 or any later version published by the FSF. Assumes lexical binding is turned on for the file you have this in.

;; Adapted from `repeat-complex-command' as of November 2020
(autoload 'repeat-message "repeat")
(defun spw/repeat-complex-command-immediately (arg)
  "Like `repeat-complex-command' followed immediately by RET."
  (interactive "p")
  (if-let ((newcmd (nth (1- arg) command-history)))
      (progn
        (add-to-history 'command-history newcmd)
        (repeat-message "Repeating %S" newcmd)
        (apply #'funcall-interactively
               (car newcmd)
               (mapcar (lambda (e) (eval e t)) (cdr newcmd))))
    (if command-history
        (error "Argument %d is beyond length of command history" arg)
      (error "There are no previous complex commands to repeat"))))

(let (real-last-repeatable-command)
  (defun spw/repeat-or-repeat-complex-command-immediately ()
    "Call `repeat' or `spw/repeat-complex-command-immediately' as appropriate.

Note that no prefix argument is accepted because this has
different meanings for `repeat' and for
`spw/repeat-complex-command-immediately', so that might cause surprises."
    (interactive)
    (if (eq last-repeatable-command this-command)
        (setq last-repeatable-command real-last-repeatable-command)
      (setq real-last-repeatable-command last-repeatable-command))
    (if (eq last-repeatable-command (caar command-history))
        (spw/repeat-complex-command-immediately 1)
      (repeat nil))))

;; `suspend-frame' is bound to both C-x C-z and C-z
(global-set-key "\C-z" #'spw/repeat-or-repeat-complex-command-immediately)

10 November, 2020 06:21AM

November 09, 2020

hackergotchi for Joachim Breitner

Joachim Breitner

Distributing Haskell programs in a multi-platform zip file

My maybe most impactful piece of code is tttool and the surrounding project, which allows you to create your own content for the Ravensburger Tiptoi™ platform. The program itself is a command line tool, and in this blog post I want to show how I go about building that program for Linux (both normal and static builds), Windows (cross-compiled from Linux), OSX (only on CI), all combined into and released as a single zip file.

Maybe some of it is useful or inspiring to my readers, or can even serve as a template. This being a blog post, though, note that it may become obsolete or outdated.

Ingredients

I am building on these components:

  • nix

Without the nix build system and package manager I probably wouldn’t even attempt to pull off complex tasks that may, say, require a patched ghc. For many years I resisted learning about nix, but when I eventually had to, I didn’t want to go back.

  • haskell.nix

This project provides an alternative Haskell build infrastructure for nix. While this is not crucial for tttool, it helps that it tends to carry some cross-compilation-related patches that the official nixpkgs lacks. I also like that it more closely follows the cabal build work-flow, where cabal calculates a build plan based on your project’s dependencies. It even has decent documentation (which is a new thing compared to two years ago).

  • niv

Niv is a neat little tool to keep track of your dependencies. You can quickly update them with, say, niv update nixpkgs. But what’s really great is to temporarily replace one of your dependencies with a local checkout, e.g. via NIV_OVERRIDE_haskellNix=$HOME/build/haskell/haskell.nix nix-instantiate -A osx-exe-bundle. There is a Github action that will keep your niv-managed dependencies up-to-date.

  • Cachix

This service (proprietary, but free for public stuff up to 10GB) gives your project its own nix cache. This means that build artifacts can be cached between CI builds or even build steps, and shared with your contributors. A cache like this is a must if you want to use nix in more interesting ways where you may end up using, say, a changed GHC compiler. Comes with GitHub actions integration.

  • CI via Github actions

Until recently, I was using Travis, but Github actions are just a tad easier to set up and, maybe more important here, the job times are high enough that you can rebuild GHC if you have to, and even if your build gets canceled or times out, cleanup CI steps still happen, so that any new nix build products will still reach your nix cache.

The repository setup

All files discussed in the following are reflected at https://github.com/entropia/tip-toi-reveng/tree/7020cde7da103a5c33f1918f3bf59835cbc25b0c.

We are starting with a fairly normal Haskell project, with a single .cabal file (but multi-package projects should work just fine). To make things more interesting, I also have a cabal.project which configures one dependency to be fetched via git from a specific fork.

To start building the nix infrastructure, we can initialize niv and configure it to use the haskell.nix repo:

niv init
niv add input-output-hk/haskell.nix -n haskellNix

This creates nix/sources.json (which you can also edit by hand) and nix/sources.nix (which you can treat like a black box).

Now we can start writing the all-important default.nix file, which defines almost everything of interest here. I will just go through it line by line, and explain what I am doing here.

{ checkMaterialization ? false }:

This defines a flag that we can later set when using nix-build, by passing --arg checkMaterialization true, and which is off by default. I’ll get to that flag later.

let
  sources = import nix/sources.nix;
  haskellNix = import sources.haskellNix {};

This imports the sources as defined in nix/sources.json, and loads the pinned revision of the haskell.nix repository.

  # windows crossbuilding with ghc-8.10 needs at least 20.09.
  # A peek at https://github.com/input-output-hk/haskell.nix/blob/master/ci.nix can help
  nixpkgsSrc = haskellNix.sources.nixpkgs-2009;
  nixpkgsArgs = haskellNix.nixpkgsArgs;

  pkgs = import nixpkgsSrc nixpkgsArgs;

Now we can define pkgs, which is “our” version of the nixpkgs package set, extended with the haskell.nix machinery. We rely on haskell.nix to pin a suitable revision of the nixpkgs set (see how we are using their niv setup).

Here we could add our own configuration, overlays, etc. to nixpkgsArgs. In fact, we do in

  pkgs-osx = import nixpkgsSrc (nixpkgsArgs // { system = "x86_64-darwin"; });

to get the nixpkgs package set of an OSX machine.

  # a nicer filterSource
  sourceByRegex =
    src: regexes: builtins.filterSource (path: type:
      let relPath = pkgs.lib.removePrefix (toString src + "/") (toString path); in
      let match = builtins.match (pkgs.lib.strings.concatStringsSep "|" regexes); in
      ( type == "directory"  && match (relPath + "/") != null
      || match relPath != null)) src;

Next I define a little helper that I have been copying between projects, and which allows me to define the input to a nix derivation (i.e. a nix build job) with a set of regexes. I’ll use that soon.

  tttool-exe = pkgs: sha256:
    (pkgs.haskell-nix.cabalProject {

The cabalProject function takes a cabal project and turns it into a nix project, running cabal v2-configure under the hood to let cabal figure out a suitable build plan. Since we want to have multiple variants of the tttool, this is so far just a function of two arguments pkgs and sha256, which will be explained in a bit.

      src = sourceByRegex ./. [
          "cabal.project"
          "src/"
          "src/.*/"
          "src/.*.hs"
          ".*.cabal"
          "LICENSE"
        ];

The cabalProject function wants to know the source of the Haskell project. There are different ways of specifying this; in this case I went for a simple whitelist approach. Note that cabal.project.freeze (which exists in the directory) is not included.

      # Pinning the input to the constraint solver
      compiler-nix-name = "ghc8102";

The cabal solver doesn’t figure out which version of ghc to use; that is still my choice. I am using GHC-8.10.2 here. It may require a bit of experimentation to see which version works for your project, especially when cross-compiling to odd targets.

      index-state = "2020-11-08T00:00:00Z";

I want the build to be deterministic, and not let cabal suddenly pick different package versions just because something got uploaded. Therefore I specify which snapshot of the Hackage package index it should consider.

      plan-sha256 = sha256;
      inherit checkMaterialization;

Here we use the second parameter, but I’ll defer the explanation for a bit.

      modules = [{
        # smaller files
        packages.tttool.dontStrip = false;
      }] ++

These “modules” are essentially configuration data that is merged in a structural way. Here we say that we want the tttool binary to be stripped (which saves a few megabytes).

      pkgs.lib.optional pkgs.hostPlatform.isMusl {
        packages.tttool.configureFlags = [ "--ghc-option=-static" ];

Also, when we are building on the musl platform, that’s when we want to produce a static build, so let’s pass -static to GHC. This seems to be enough in terms of flags to produce static binaries. It helps that my project is using mostly pure Haskell libraries; if you link against C libraries you might have to jump through additional hoops to get static linking going. The haskell.nix documentation has a section on static building with some flags to cargo-cult.

        # terminfo is disabled on musl by haskell.nix, but still the flag
        # is set in the package plan, so override this
        packages.haskeline.flags.terminfo = false;
      };

This (again only used when the platform is musl) seems to be necessary to work around what might be a bug in haskell.nix.

    }).tttool.components.exes.tttool;

The cabalProject function returns a data structure with all Haskell packages of the project, and for each package the different components (libraries, tests, benchmarks and of course executables). We only care about the tttool executable, so let’s project that out.

  osx-bundler = pkgs: tttool:
   pkgs.stdenv.mkDerivation {
      name = "tttool-bundle";

      buildInputs = [ pkgs.macdylibbundler ];

      builder = pkgs.writeScript "tttool-osx-bundler.sh" ''
        source ${pkgs.stdenv}/setup

        mkdir -p $out/bin/osx
        cp ${tttool}/bin/tttool $out/bin/osx
        chmod u+w $out/bin/osx/tttool
        dylibbundler \
          -b \
          -x $out/bin/osx/tttool \
          -d $out/bin/osx \
          -p '@executable_path' \
          -i /usr/lib/system \
          -i ${pkgs.darwin.Libsystem}/lib
      '';
    };

This function, only to be used on OSX, takes a fully build tttool, finds all the system libraries it is linking against, and copies them next to the executable, using the nice macdylibbundler. This way we can get a self-contained executable.

A nix expert will notice that this probably should be written with pkgs.runCommandNoCC, but then dylibbundler fails because it lacks otool. This should work eventually, though.

in rec {
  linux-exe      = tttool-exe pkgs
     "0rnn4q0gx670nzb5zp7xpj7kmgqjmxcj2zjl9jqqz8czzlbgzmkh";
  windows-exe    = tttool-exe pkgs.pkgsCross.mingwW64
     "01js5rp6y29m7aif6bsb0qplkh2az0l15nkrrb6m3rz7jrrbcckh";
  static-exe     = tttool-exe pkgs.pkgsCross.musl64
     "0gbkyg8max4mhzzsm9yihsp8n73zw86m3pwvlw8170c75p3vbadv";
  osx-exe        = tttool-exe pkgs-osx
     "0rnn4q0gx670nzb5zp7xpj7kmgqjmxcj2zjl9jqqz8czzlbgzmkh";

Time to create the four versions of tttool. In each case we use the tttool-exe function from above, passing the package set (pkgs,…) and a SHA256 hash.

The package set is either the normal one, or it is one of those configured for cross compilation, building either for Windows or for Linux using musl, or it is the OSX package set that we instantiated earlier.

The SHA256 hash describes the result of the cabal plan calculation that happens as part of cabalProject. By noting down the expected result, nix can skip that calculation, or fetch it from the nix cache etc.

How do we know what number to put there, and when to change it? That’s when the --arg checkMaterialization true flag comes into play: When that is set, cabalProject will not blindly trust these hashes, but rather re-calculate them, and tell you when they need to be updated. We’ll make sure that CI checks them.
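
Concretely, a local check is just a matter of (attribute names as defined above):

# re-verify the pinned plan hashes; a stale hash makes the build fail
# and tells you it needs updating in default.nix
nix-build --arg checkMaterialization true -A linux-exe
nix-build --arg checkMaterialization true -A static-exe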

  osx-exe-bundle = osx-bundler pkgs-osx osx-exe;

For OSX, I then run the output through osx-bundler defined above, to make it independent of any library paths in /nix/store.

This is already good enough to build the tool for the various systems! The rest of the file is related to packaging up the binaries, to tests, and various other things, but nothing too essential. So if you got bored, you can more or less stop now.

  static-files = sourceByRegex ./. [
    "README.md"
    "Changelog.md"
    "oid-decoder.html"
    "example/.*"
    "Debug.yaml"
    "templates/"
    "templates/.*\.md"
    "templates/.*\.yaml"
    "Audio/"
    "Audio/digits/"
    "Audio/digits/.*\.ogg"
  ];

  contrib = ./contrib;

The final zip file that I want to serve to my users contains a bunch of files from throughout my repository; I collect them here.

  book = …;

The project comes with documentation in the form of a Sphinx project, which we build here. I’ll omit the details, because they are not relevant for this post (but of course you can peek if you are curious).

  os-switch = pkgs.writeScript "tttool-os-switch.sh" ''
    #!/usr/bin/env bash
    case "$OSTYPE" in
      linux*)   exec "$(dirname "''${BASH_SOURCE[0]}")/linux/tttool" "$@" ;;
      darwin*)  exec "$(dirname "''${BASH_SOURCE[0]}")/osx/tttool" "$@" ;;
      msys*)    exec "$(dirname "''${BASH_SOURCE[0]}")/tttool.exe" "$@" ;;
      cygwin*)  exec "$(dirname "''${BASH_SOURCE[0]}")/tttool.exe" "$@" ;;
      *)        echo "unsupported operating system $OSTYPE" ;;
    esac
  '';

The zipfile should provide a tttool command that works on all systems. To that end, I implement a simple platform switch using bash. I use pkgs.writeScript so that I can include that file directly in default.nix, but it would have been equally reasonable to just save it into nix/tttool-os-switch.sh and include it from there.

  release = pkgs.runCommandNoCC "tttool-release" {
    buildInputs = [ pkgs.perl ];
  } ''
    # check version
    version=$(${static-exe}/bin/tttool --help|perl -ne 'print $1 if /tttool-(.*) -- The swiss army knife/')
    doc_version=$(perl -ne "print \$1 if /VERSION: '(.*)'/" ${book}/book.html/_static/documentation_options.js)

    if [ "$version" != "$doc_version" ]
    then
      echo "Mismatch between tttool version \"$version\" and book version \"$doc_version\""
      exit 1
    fi

Now the derivation that builds the content of the release zip file. First I double check that the version number in the code and in the documentation matches. Note how ${static-exe} refers to a path with the built static Linux build, and ${book} the output of the book building process.

    mkdir -p $out/
    cp -vsr ${static-files}/* $out
    mkdir $out/linux
    cp -vs ${static-exe}/bin/tttool $out/linux
    cp -vs ${windows-exe}/bin/* $out/
    mkdir $out/osx
    cp -vsr ${osx-exe-bundle}/bin/osx/* $out/osx
    cp -vs ${os-switch} $out/tttool
    mkdir $out/contrib
    cp -vsr ${contrib}/* $out/contrib/
    cp -vsr ${book}/* $out
  '';

The rest of the release script just copies files from various build outputs that we have defined so far.

Note that this is using both static-exe (built on Linux) and osx-exe-bundle (built on Mac)! This means you can only build the release if you either have set up a remote OSX builder (a pretty nifty feature of nix, which I unfortunately can’t use, since I don't have access to a Mac), or the build product must be available in a nix cache (which it is in my case, as I will explain later).

The output of this derivation is a directory with all the files I want to put in the release.

  release-zip = pkgs.runCommandNoCC "tttool-release.zip" {
    buildInputs = with pkgs; [ perl zip ];
  } ''
    version=$(bash ${release}/tttool --help|perl -ne 'print $1 if /tttool-(.*) -- The swiss army knife/')
    base="tttool-$version"
    echo "Zipping tttool version $version"
    mkdir -p $out/$base
    cd $out
    cp -r ${release}/* $base/
    chmod u+w -R $base
    zip -r $base.zip $base
    rm -rf $base
  '';

And now these files are zipped up. Note that this automatically determines the right directory name and basename for the zipfile.

This concludes the steps necessary for a release.

  gme-downloads = …;
  tests = …;

These two definitions in default.nix are related to some simple testing, and again not relevant for this post.

  cabal-freeze = pkgs.stdenv.mkDerivation {
    name = "cabal-freeze";
    src = linux-exe.src;
    buildInputs = [ pkgs.cabal-install linux-exe.env ];
    buildPhase = ''
      mkdir .cabal
      touch .cabal/config
      rm cabal.project # so that cabal new-freeze does not try to use HPDF via git
      HOME=$PWD cabal new-freeze --offline --enable-tests || true
    '';
    installPhase = ''
      mkdir -p $out
      echo "-- Run nix-shell -A check-cabal-freeze to update this file" > $out/cabal.project.freeze
      cat cabal.project.freeze >> $out/cabal.project.freeze
    '';
  };

Above I mentioned that I still would like to be able to just run cabal, and ideally it should take the same library versions that the nix-based build does. But pinning the version of ghc in cabal.project is not sufficient; I also need to pin the precise versions of the dependencies. This is best done with a cabal.project.freeze file.

The above derivation runs cabal new-freeze in the environment set up by haskell.nix and grabs the resulting cabal.project.freeze. With this I can run nix-build -A cabal-freeze and fetch the file from result/cabal.project.freeze and add it to the repository.
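
A typical round trip then looks roughly like this:

nix-build -A cabal-freeze
cp result/cabal.project.freeze .
git add cabal.project.freeze

(Alternatively, as described below, nix-shell -A check-cabal-freeze updates the file in the repository directly.)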

  check-cabal-freeze = pkgs.runCommandNoCC "check-cabal-freeze" {
      nativeBuildInputs = [ pkgs.diffutils ];
      expected = cabal-freeze + /cabal.project.freeze;
      actual = ./cabal.project.freeze;
      cmd = "nix-shell -A check-cabal-freeze";
      shellHook = ''
        dest=${toString ./cabal.project.freeze}
        rm -f $dest
        cp -v $expected $dest
        chmod u-w $dest
        exit 0
      '';
    } ''
      diff -r -U 3 $actual $expected ||
        { echo "To update, please run"; echo "nix-shell -A check-cabal-freeze"; exit 1; }
      touch $out
    '';

But generated files in repositories are bad, so if that cannot be avoided, at least I want a CI job that checks if they are up to date. This job does that. What’s more, it is set up so that if I run nix-shell -A check-cabal-freeze it will update the file in the repository automatically, which is much more convenient than manually copying.

Lately, I have been using this pattern regularly when adding generated files to a repository:

  • Create one nix derivation that creates the files.
  • Create a second derivation that compares the output of that derivation against the file in the repo.
  • Create a derivation that, when run in nix-shell, updates that file.

Sometimes that derivation is its own file (so that I can just run nix-shell nix/generate.nix), or it is merged into one of the other two.

This concludes the tour of default.nix.

The CI setup

The next interesting bit is the file .github/workflows/build.yml, which tells Github Actions what to do:

name: "Build and package"
on:
  pull_request:
  push:

Standard prelude: Run the jobs in this file upon all pushes to the repository, and also on all pull requests. Annoying downside: If you open a PR within your repository, everything gets built twice. Oh well.

jobs:
  build:
    strategy:
      fail-fast: false
      matrix:
        include:
        - target: linux-exe
          os: ubuntu-latest
        - target: windows-exe
          os: ubuntu-latest
        - target: static-exe
          os: ubuntu-latest
        - target: osx-exe-bundle
          os: macos-latest
    runs-on: ${{ matrix.os }}

The “build” job is a matrix job, i.e. there are four variants, one for each of the different tttool builds, together with an indication of what kind of machine to run this on.

    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12

We begin by checking out the code and installing nix via the install-nix-action.

    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'

Then we configure our Cachix cache. This means that this job will use build products from the cache if possible, and it will also push new builds to the cache. This requires a secret key, which you get when setting up your Cachix cache. See the nix and Cachix tutorial for good instructions.
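
Locally, pulling from the same cache is a one-liner (assuming the cachix client is installed), which configures it as a substituter for nix:

cachix use tttool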

    - run: nix-build --arg checkMaterialization true -A ${{ matrix.target }}

Now we can actually run the build. We set checkMaterialization to true so that CI will tell us if we need to update these hashes.

    # work around https://github.com/actions/upload-artifact/issues/92
    - run: cp -RvL result upload
    - uses: actions/upload-artifact@v2
      with:
        name: tttool (${{ matrix.target }})
        path: upload/

For convenient access to build products, e.g. from pull requests, we store them as Github artifacts. They can then be downloaded from Github’s CI status page.

  test:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'
    - run: nix-build -A tests

The next job repeats the setup, but now runs the tests. Because of needs: build it will not start before the build job has completed. This also means that it should get the actual tttool executable to test from our nix cache.

  check-cabal-freeze:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'
    - run: nix-build -A check-cabal-freeze

The same, but now running the check-cabal-freeze test mentioned above. Quite annoying to repeat the setup instructions for each job…

  package:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'

    - run: nix-build -A release-zip

    - run: unzip -d upload ./result/*.zip
    - uses: actions/upload-artifact@v2
      with:
        name: Release zip file
        path: upload

Finally, with the same setup, but slightly different artifact upload, we build the release zip file. Again, we wait for build to finish so that the built programs are in the nix cache. This is especially important since this runs on linux, so it cannot build the OSX binary and has to rely on the cache.

Note that we don’t need to checkMaterialization again.

Annoyingly, the upload-artifact action insists on zipping the files you hand to it. A zip file that contains just a zipfile is kinda annoying, so I unpack the zipfile here before uploading the contents.

Conclusion

With this setup, when I do a release of tttool, I just bump the version numbers, wait for CI to finish building, run nix-build -A release-zip and upload result/tttool-n.m.zip. A single file that works on all target platforms. I have not yet automated making the actual release, but with one release per year this is fine.

Also, when trying out a new feature, I can easily create a branch or PR for that and grab the build products from Github’s CI, or ask people to try them out (e.g. to see if they fixed their bugs). Note, though, that you have to sign into Github before being able to download these artifacts.

One might think that this is a fairly hairy setup – finding the right combinations of various repositories so that cross-compilation works as intended. But thanks to nix’s value propositions, this does work! The setup presented here was a remake of a setup I did two years ago, with a much less mature haskell.nix. Back then, I committed a fair number of generated files to git, and juggled more complex files … but once it worked, it kept working for two years. I was indeed insulated from upstream changes. I expect that this setup will also continue to work reliably, until I choose to upgrade it again. Hopefully, things will then be even simpler, and require fewer work-arounds and less manual intervention.

09 November, 2020 07:42PM by Joachim Breitner (mail@joachim-breitner.de)

November 08, 2020

Ian Jackson

Gazebo out of scaffolding

Today we completed our gazebo, which we designed and built out of scaffolding:

Picture of gazebo

Scaffolding is fairly expensive but building things out of it is enormous fun! You can see a complete sequence of the build process, including pictures of the "engineering maquette", at https://www.chiark.greenend.org.uk/~ijackson/2020/scaffold/

Post-lockdown maybe I will build a climbing wall or something out of it...

edited 2020-11-08 20:44Z to fix img url following hosting reorg




08 November, 2020 08:44PM

Russell Coker

November 06, 2020

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

plocate in backports

plocate 1.0.7 hit Debian backports.org today, which means that it's now available for use on Debian stable (and machines with zillions of files are not unlikely to run stable). (The package page still says 1.0.5, but 1.0.7 really is up.)

The other big change from 1.0.5 is that plocate now escapes potentially dangerous filenames, like modern versions of GNU ls do. I'm not overly happy about this, but it's really hard not to do it; any user can create a publicly-viewable file on the system, and allowing them to sneak in arbitrary escape sequences is just too much. Most terminals should be good by now, but there have been many in the past with potentially really dangerous behavior (like executing arbitrary commands!), and just being able to mess up another user's terminal isn't a good idea. Most users should never really see filenames being escaped, but if you have a filename with \n or other nonprintable characters in it, they will be escaped using bash's $'foo\nbar' syntax.
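
For readers who have not come across that quoting style, a quick illustration in bash (nothing plocate-specific about it):

touch $'foo\nbar'   # creates one file whose name contains a literal newline

The $'…' form interprets backslash escapes, so it is a compact and unambiguous way to write (or print back) a filename with a newline in it.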

So, anyone up for uploading to Fedora? :-)

06 November, 2020 04:15PM

Sandro Tosi

QNAP firmware 4.5.1.1465: disable ssh management menu

as a good boy i just upgraded my QNAP NAS to the latest available firmware, 4.5.1.1465, but after the reboot there's an ugly surprise awaiting me

once i ssh'd into the box to do my stuff, instead of a familiar bash prompt i'm greeted by a management menu that allows me to perform some basic management tasks or quit it and go back to the shell. i don't really need this menu (in particular because i have automations that regularly ssh into the box and they are not meant to be interactive).

to disable it: edit /etc/profile and comment the line "[[ "admin" = "$USER" ]] && /sbin/qts-console-mgmt -f" (you can judge me later for sshing as root)
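
for the automation-minded, something along these lines should do the commenting non-interactively (untested; the exact whitespace in /etc/profile may differ, so check before relying on it):

sed -i 's|^\[\[ "admin" = "\$USER" \]\]|# &|' /etc/profile

note that a future firmware upgrade may well put the original line back, so the edit might have to be repeated.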

06 November, 2020 02:42AM by Sandro Tosi (noreply@blogger.com)

November 05, 2020

hackergotchi for Mike Gabriel

Mike Gabriel

Welcome, Fre(i)e Software GmbH

Last week I received the official notice: There is now a German company named "Fre(i)e Software GmbH" registered with the German Trade Register.

Founding a New Company

Over the past months I have put my energy into founding a new company. As a freelancing IT consultant I started facing the limitation of other companies having strict policies that forbid cooperation with one-person businesses (Personengesellschaften).

Thus, the requirement for setting up a GmbH business came onto my agenda. I will move some of my business activities into this new company, starting next year.

Policy Ideas

The "Fre(i)e Software GmbH" will be a platform to facilitate the growth and spreading of Free Software on this planet.

Here are some first ideas for company policies:

  • The idea is to bring together teams of developers and consultants that provide the highest expertise in FLOSS.

  • Everything this company will do, will finally (or already during the development cycles) be published under some sort of a free software / content license (for software, ideally a copyleft license).

  • Staff members will work and live across Europe; freelancers may live in any country that German businesses may do business with.

  • Ideally, staff members and freelancers work on projects that they can identify themselves with, projects that they love.

  • Software development and software design is an art. In the company we will honour this. We will be artists.

  • In software development, we will enroll our customers in non-CLA FLOSS copyright holdership policies: developers can become copyright holders of the worked-on code projects as persons. This will strengthen the liberty nature of the FLOSS licensed code brought forth in the company.

  • The Fre(i)e Software GmbH will be a business focussing on sustainability and sufficiency. We will be gentle to our planet. We won't work on projects that create artificial needs.

  • We all will be experts in communication. We all will continually work on improving our communication skills.

  • Integrity shall be a virtue to strive for in the company.

  • We will be honest to ourselves and the customers about the mistakes we do, the misassumptions we have.

  • We will honour and support diversity.

This is all pretty fresh. I'll be happy about hearing your feedback, ideas and worries. If you are interested in joining the company, please let me know. If you are interested in supporting a company with such values, please also let me know.

light+love
Mike Gabriel (aka sunweaver)

05 November, 2020 11:38AM by sunweaver

hackergotchi for Jonathan Dowland

Jonathan Dowland

PhD Year 3 progression

I'm now into my 4th calendar year of my part-time PhD, corresponding to half-way through Stage 2, or Year 2 for a full-time student. Here's a report I wrote that describes what I did last year and what I'm working on going forward.

year3 progression report.pdf (223K PDF)

05 November, 2020 09:33AM

hackergotchi for Junichi Uekawa

Junichi Uekawa

Sent a pull request to document sshfs slave mode.

Sent a pull request to document sshfs slave mode. Every time I try to do it I forget, so at least I have a document about how to do it. Also changed the name from slave to passive, but I don't think that will help me remember... Not feeling particularly creative about the name.
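
A sketch of the kind of invocation this is about (illustrative paths; dpipe comes from the vde2 package, and newer sshfs releases spell the option -o passive):

dpipe /usr/lib/openssh/sftp-server = ssh user@remotehost sshfs :/srv/data /mnt/data -o slave

Here sftp-server runs on the local machine and sshfs runs on the remote one, so the local /srv/data ends up mounted at /mnt/data on the remote host without the remote ever connecting back.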

05 November, 2020 01:34AM by Junichi Uekawa

November 04, 2020

hackergotchi for Ben Hutchings

Ben Hutchings

Debian LTS work, October 2020

I was assigned 6.25 hours of work by Freexian's Debian LTS initiative and carried over 17.5 hours from earlier months. I worked 11.5 hours this month and returned 7.75 hours to the pool, so I will carry over 4.5 hours to December.

I updated linux-4.19 to include the changes in DSA-4772-1, and issued DLA-2417-1 for this.

I updated linux (4.9 kernel) to include upstream stable fixes, and issued DLA-2420-1. This resulted in a regression on some Xen PV environments. Ian Jackson identified the upstream fix for this, which had not yet been applied to all the stable branches that needed it. I made a further update with just that fix, and issued DLA-2420-2.

I have also been working to backport fixes for some less urgent security issues in Linux 4.9, but have not yet applied those fixes.

04 November, 2020 07:38PM

hackergotchi for Martin-Éric Racine

Martin-Éric Racine

Migrating to Predictable Network Interface Names

A couple of years ago, I moved into a new flat that comes with RJ45 sockets wired for 10 Gigabit (but currently offering 1 Gigabit) Ethernet.

This also meant changing the settings on my router box for my new ISP.

I took this opportunity to review my router's other settings too. I'll be blogging about these over the next few posts.

Migrating to Predictable Network Interface Names

Ever since Linus decided to flip the network interface enumeration order in the Linux kernel, I had been relying on udev's persistent network interface rules to maintain some semblance of consistency in the NIC naming scheme of my hosts. It has never been a totally satisfactory method, since it required manually editing the file to list the MAC addresses of all Ethernet cards and WiFi dongles likely to appear on that host to consistently use an easy-to-remember name that I could adopt for ifupdown configuration files.
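
For reference, a typical line in that hand-maintained file (usually /etc/udev/rules.d/70-persistent-net.rules) looked something like this, with a made-up MAC address:

SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:11:22:33:44:55", NAME="lan0"

Every new card or dongle meant adding or editing such a line by hand.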

Enter predictable interface names. What started as a Linux kernel module project at Dell was eventually re-implemented in systemd. However, clear documentation on the naming scheme had been difficult to find and udev's persistent network interface rules gave me what I needed, so I postponed the transition for years. Relocating to a new flat and rethinking my home network to match gave me an opportunity to revisit the topic.

The naming scheme is surprisingly simple and logical, once proper explanations have been found. The short version:

  • Ethernet interfaces are called en i.e. Ether Net.
  • Wireless interfaces are called wl i.e. Wire Less. (yes, the official documentation calls this Wireless Local but, in everyday usage, remembering Wire Less is simpler)

The rest of the name specifies on which PCI bus and which slot the interface is found. On my old Dell laptop, it looks like this:

  • enp9s0: Ethernet interface at PCI bus 9 slot 0.
  • wlp12s0: Wireless interface at PCI bus 12 slot 0.

An added bonus of the naming scheme is that it makes replacing hardware a breeze, since the naming scheme is bus and slot specific, not MAC address specific. No need to edit any configuration file. I saw this first-hand when I got around to upgrading my router's network cards to Gigabit-capable ones to take advantage of my new home's broadband LAN. All it took was to power off the host, swap the Ethernet cards and power on the host. That's it. systemd took care of everything else.

Still, migrating looked like a daunting task. Debian's wiki page gave me some answers, but didn't provide a systematic approach. I came up with the following shell script:

#!/bin/sh
lspci | grep -i -e ethernet -e network
sudo dmesg | grep -i renamed
for n in $(ls -X /sys/class/net/ | grep -v lo);
do
  echo $n: && udevadm test-builtin net_id /sys/class/net/$n 2>/dev/null | grep NAME;
  sudo rgrep $n /etc
  sudo find /etc -name "*$n*"
done

This combined ideas found on the Debian wiki with a few of my own. Running the script before and after the migration ensured that I hadn't missed any configuration file. Once I was satisfied with that, I commented out the old udev persistent network interface rules, ran dpkg-reconfigure on all my Linux kernel images to purge the rules from the initrd images, and called it a day.

... well, not quite. It turns out that with bridge-utils, bridge_ports all no longer works. One must manually list all interfaces to be bridged. Debian bug report filed.

PS: Luca Capello pointed out that Debian 10/Buster's Release Notes include migration instructions.

04 November, 2020 04:58PM by Martin-Éric (noreply@blogger.com)

hackergotchi for Jonathan Dowland

Jonathan Dowland

Amiga mouse pointer

I've started struggling to see my mouse pointer on my desktop. I probably need to take an eye test and possibly buy some new glasses. In the meantime, I've changed my mouse pointer to something larger and more colourful: The Amiga Workbench 1.3 mouse pointer that I grew up with, but scaled up to 128x128 pixels (from 16x16, I think)

Desktop screenshot with Workbench 1.3 mouse pointer

See if you can spot it

The X file format for mouse pointers is a bit strange but it turned out to be fairly easy to convert it over.
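
For anyone wanting to do the same, the conversion can be done with xcursorgen, roughly like this (a sketch; the file names are made up, and the two numbers after the size are the hotspot coordinates):

echo "128 1 1 amiga-pointer-128.png" > amiga.cfg
xcursorgen amiga.cfg left_ptr
mkdir -p ~/.icons/amiga/cursors
mv left_ptr ~/.icons/amiga/cursors/

Then it is just a matter of selecting the new cursor theme in the desktop environment.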

An enormous improvement!

04 November, 2020 04:35PM

hackergotchi for Norbert Preining

Norbert Preining

Debian KDE/Plasma Status 2020-11-04

About a month's worth of updates on KDE/Plasma in Debian has accumulated, so here we go. The highlights are: Plasma 5.19.5 based on Qt 5.15 is in Debian/experimental and hopefully soon in Debian/unstable, and my own builds at OBS have been updated to Plasma 5.20.2, Frameworks 5.75, Apps 20.08.2.

Thanks to the dedicated work of the Qt maintainers, Qt 5.15 has finally entered Debian/unstable and we can finally target Plasma 5.20.

OBS packages

The OBS packages as usual follow the latest release, and currently ship KDE Frameworks 5.75, KDE Apps 20.08.2, and new, Plasma 5.20.2. The package sources are as usual (note the different path for the Plasma packages and the App packages, containing the release version!), for Debian/unstable:

deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/frameworks/Debian_Unstable/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/plasma520/Debian_Unstable/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/apps2008/Debian_Unstable/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other/Debian_Unstable/ ./

and the same with Testing instead of Unstable for Debian/testing.

The update to Plasma 5.20 took a bit of time, not only because of the wait for Qt 5.15, but also because I couldn’t get it running on my desktop, only in the VM. It turned out that the Plasmoid Event Calendar needed an update, and the old version crashed Plasma (“v68 and below crash in Arch after the Qt 5.15.1 update.”). After I realized that, it was only a question of updating to get Plasma 5.20 running.

There are two points I have to mention (and I will fix sooner or later):

  • The update will need two tries because files moved from plasma-desktop to plasma-workspace. I will add the required Replaces/Conflicts later.
  • Make sure that the kwayland-server packages (libkwaylandserver5, libkwaylandserver-dev) are at version 5.20.2; a quick check is shown below. Some old versions had an epoch, so automatic updates will not work.
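
That version check is just an apt query:

apt policy libkwaylandserver5 libkwaylandserver-dev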

As usual, let me know your experience!

Debian main packages

The packages in Debian/experimental are at the most current state, 5.19.5. We have held back the upload to unstable until the Qt 5.15 transition is over, but hope to upload rather soon. After the upload is done, we will work on getting 5.20 into unstable.

My aim is to get the most recent version of Plasma 5.20 into Debian Bullseye, so we need to do that before the freeze early next year. Let us hope for the best.

04 November, 2020 01:26AM by Norbert Preining

November 03, 2020

hackergotchi for Martin-Éric Racine

Martin-Éric Racine

Adding IPv6 support to my home LAN

A couple of years ago, I moved into a new flat that comes with RJ45 sockets wired for 10 Gigabit (but currently offering 1 Gigabit) Ethernet.

This also meant changing the settings on my router box for my new ISP.

I took this opportunity to review my router's other settings too. I'll be blogging about these over the next few posts.

Adding IPv6 support to my home LAN

I have been following the evolution of IPv6 ever since the KAME project produced the first IPv6 implementation. I have also been keeping track of the IPv4 address depletion.

Around the time the IPv6 Day was organized in 2011, I started investigating the situation of IPv6 support at local ISPs.

Well, never mind all those rumors about Finland being some high-tech mecca. Back then, no ISP went beyond testing their routers for IPv6 compatibility and producing white papers on what their limited test deployments accomplished.

Not that it matters much, in practice. Most IPv6 documentation out there, including Debian's own, still focuses on configuring transitional mechanisms, especially how to connect to a public IPv6 tunnel broker.

Relocating to a new flat and rethinking my home network to match gave me an opportunity to revisit the topic. Much to my delight, my current ISP offers native IPv6.

This prompted me to go back and read up on IPv6 one more time. One important detail:

IPv6 hosts are globally reachable.

The implications of this don't immediately spring to mind for someone used to IPv4 network address translation (NAT):

Any network service running on an IPv6 host can be reached by anyone anywhere.

Contrary to IPv4, there is no division between private and public IP addresses. Whereas a device behind an IPv4 NAT essentially is shielded from the outside world, IPv6 breaks this assumption in more than one way. Not only is the host reachable from anywhere, its default IPv6 address is a mathematical conversion (EUI-64) of the network interface's MAC address, which makes every connection forensically traceable to a unique device.
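
As a concrete illustration with a made-up MAC address: the EUI-64 interface identifier is formed by splitting the MAC in half, inserting ff:fe in the middle, and flipping the universal/local bit of the first octet.

MAC address    00:11:22:33:44:55
insert ff:fe   00:11:22:ff:fe:33:44:55
flip U/L bit   02:11:22:ff:fe:33:44:55
resulting IP   2001:db8:1:2:211:22ff:fe33:4455 (for an advertised prefix of 2001:db8:1:2::/64)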

Basically, if you hadn't given much thought to firewalls until now, IPv6 should give you enough goose bumps to finally get around to it. Tightening the configuration of every network service is also an absolute must. For instance, I configured sshd to only listen to private IPv4 addresses.
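
One way to express that in /etc/ssh/sshd_config (the address is illustrative, not necessarily my exact setup):

AddressFamily inet
ListenAddress 10.10.10.10

AddressFamily inet keeps sshd off IPv6 entirely, and ListenAddress pins it to a single private IPv4 address.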

What /etc/network/interfaces might look like on a dual-stack (IPv4 + IPv6) host:

allow-hotplug enp9s0

iface enp9s0 inet dhcp
iface enp9s0 inet6 auto
	privext 2
	dhcp 1

The auto method means that IPv6 will be auto-configured using SLAAC; privext 2 enables IPv6 privacy options and specifies that we prefer connecting via the randomly-generated IPv6 address, rather than the EUI-64 calculated MAC specific address; dhcp 1 enables passive DHCPv6 to fetch additional routing information.

The above works for most desktop and laptop configurations.

Where things got more complicated is on the router. I decided early on to keep NAT to provide an IPv4 route to the outside world. Now how exactly is IPv6 routing done? Every node along the line must have its own IPv6 address... including the router's LAN interface. This is accomplished using the sample script found in Debian's IPv6 prefix delegation wiki page. I modified mine as follows (the rest of the script is omitted for clarity):

#Both LAN interfaces on my private network are bridged via br0
IA_PD_IFACE="br0"
IA_PD_SERVICES="dnsmasq"
IA_PD_IPV6CALC="/usr/bin/ipv6calc"

Just put the script at the suggested location. We'll need to request a prefix on the router's outside interface to utilize it. This gives us the following interfaces file:

allow-hotplug enp2s4 enp2s8 enp2s9
auto br0

iface enp2s4 inet dhcp
iface enp2s4 inet6 auto
	request_prefix 1
	privext 2
	dhcp 1

iface enp2s8 inet manual
iface enp2s8 inet6 manual

iface enp2s9 inet manual
iface enp2s9 inet6 manual

iface br0 inet static
	bridge_ports enp2s8 enp2s9
	address 10.10.10.254

iface br0 inet6 manual
	bridge_ports enp2s8 enp2s9
	# IPv6 from /etc/dhcp/dhclient-exit-hooks.d/prefix_delegation

The IPv4 NAT and IPv6 Bridge script on my router looks as follows:

#!/bin/sh
PATH="/usr/sbin:/sbin:/usr/bin:/bin"
wan=enp2s4
lan=br0
########################################################################
# IPv4 NAT
iptables -F; iptables -t nat -F; iptables -t mangle -F
iptables -X; iptables -t nat -X; iptables -t mangle -X
iptables -Z; iptables -t nat -Z; iptables -t mangle -Z
iptables -t nat -A POSTROUTING -o $wan -j MASQUERADE
echo 1 > /proc/sys/net/ipv4/ip_forward
########################################################################
# IPv6 bridge
ip6tables -F; ip6tables -X; ip6tables -Z
# Default policy DROP
ip6tables -P FORWARD DROP
# Allow ICMPv6 forwarding
ip6tables -A FORWARD -p ipv6-icmp -j ACCEPT
# Allow established connections
ip6tables -I FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
# Accept packets FROM LAN to everywhere
ip6tables -I FORWARD -i $lan -j ACCEPT
echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
echo 1 > /proc/sys/net/ipv6/conf/default/forwarding
# IPv6 propagation via /etc/dhcp/dhclient-exit-hooks.d/prefix_delegation

The above already provided enough IPv6 connectivity to pass the IPv6 test on my desktop inside the LAN.

To make things more fun, I enabled DHCPv6 support for my LAN on the router's dnsmasq by adding the last 3 lines to the configuration:

dhcp-hostsfile=/etc/dnsmasq-ethersfile
bind-interfaces
interface=br0
except-interface=enp2s4
no-dhcp-interface=enp2s4
dhcp-range=tag:br0,10.10.10.0,static,infinite
dhcp-range=tag:br0,::1,constructor:br0,ra-names,ra-stateless,infinite
enable-ra
dhcp-option=option6:dns-server,[::],[2606:4700:4700::1111],[2001:4860:4860::8888]

The first 5 lines (included here for emphasis) are extremely important: they ensure that dnsmasq won't provide any IPv4 or IPv6 service to the outside interface (enp2s4) and that DHCP will only be provided for LAN hosts whose MAC address is known. Line 6 shows how dnsmasq's DHCP service syntax differs between IPv4 and IPv6. The rest of my configuration was omitted on purpose.

Enabling native IPv6 on my LAN has been an interesting experiment. I'm sure that someone could come up with even better ip6tables rules for the router or for my desktop hosts. Feel free to mention them in the blog's comments.

03 November, 2020 07:21PM by Martin-Éric (noreply@blogger.com)

hackergotchi for Holger Levsen

Holger Levsen

20200801-debconf3

DebConf3

This t-shirt is 17 years old and from DebConf3. I should probably wash it at 60 degrees Celsius for once...

DebConf3 was my first DebConf and took place in Oslo, Norway, in 2003. I was very happy to be invited, like any Debian contributor at that time, and that Debian would provide food and accommodation for everyone. Accommodation was sleeping on the floor in some classrooms of an empty school and I remember having tasted grasshoppers provided by a friendly Gunnar Wolf there, standing in line on the first day with the SSH maintainer (OMG!1 (update: I originally wrote here that it wasn't Colin back then, but Colin mailed me to say that he was indeed maintaining SSH even back then, so I now believe I've met the SVN maintainer Matt Kraii instead! :-))) and meeting the one Debian person I had actually worked with before: Thomas Lange or MrFAI (update: Thomas also mailed me and said this was at DebConf5). In Oslo I also was exposed to Skolelinux / Debian Edu for the first time, saw a certain presentation from the FTP masters and also noticed some people recording the talks, though as I learned later these videos were never released to the public. And there was this fifteen-year-old called Toresbe, who powered on the PDPs, which were double his age. And then actually made use of them. And and and.

I'm very happy I went to this DebConf. Without going, my Debian journey would have been very different today. Thanks to everyone who made this such a welcoming event. Thanks to anyone who makes any event welcoming! :)

03 November, 2020 05:06PM

hackergotchi for Wouter Verhelst

Wouter Verhelst

Dear Google

... Why do you have to be so effing difficult about a YouTube API project that is used for a single event per year?

FOSDEM creates 600+ videos on a yearly basis. There is no way I am going to manually upload 600+ videos through your webinterface, so we use the API you provide, using a script written by Stefano Rivera. This script grabs video filenames and metadata from a YAML file, and then uses your APIs to upload said videos with said metadata. It works quite well. I run it from cron, and it uploads files until the quota is exhausted, then waits until the next time the cron job runs. It runs so well, that the first time we used it, we could upload 50+ videos on a daily basis, and so the uploads were done as soon as all the videos were created, which was a few months after the event. Cool!

The second time we used the script, it did not work at all. We asked one of our keynote speakers, who happened to be some hotshot at your company, to help us out. He contacted the YouTube people, and whatever had been broken was quickly fixed, so yay, uploads worked again.

I found out later that this is actually a normal thing if you don't use your API quota for 90 days or more. Because it's happened to us every bloody year.

For the 2020 event, rather than going through back channels (which happened to be unavailable this edition), I tried to use your normal ways of unblocking the API project. This involves creating a screencast of a bloody command line script and describing various things that don't apply to FOSDEM and ghaah shoot me now so meh, I created a new API project instead, and had the uploads go through that. Doing so gives me a limited quota that only allows about 5 or 6 videos per day, but that's fine, it gives people subscribed to our channel the time to actually watch all the videos while they're being uploaded, rather than being presented with a boatload of videos that they can never watch in a day. Also it doesn't overload subscribers, so yay.

About three months ago, I started uploading videos. Since then, every day, the "fosdemtalks" channel on YouTube has published five or six videos.

Given that, imagine my surprise when I found this in my mailbox this morning...

Google lies, claiming that my YouTube API project isn't being used for 90 days and informing me that it will be disabled

This is an outright lie, Google.

The project has been created 90 days ago, yes, that's correct. It has been used every day since then to upload videos.

I guess that means I'll have to deal with your broken automatic content filters to try and get stuff unblocked...

... or I could just give up and not do this anymore. After all, all the FOSDEM content is available on our public video host, too.

03 November, 2020 04:16PM

hackergotchi for Martin-Éric Racine

Martin-Éric Racine

GRUB fine-tuning

A couple of years ago, I moved into a new flat that comes with RJ45 sockets wired for 10 Gigabit (but currently offering 1 Gigabit) Ethernet.

This also meant changing the settings on my router box for my new ISP.

I took this opportunity to review my router's other settings too. I'll be blogging about these over the next few posts.

GRUB fine-tuning

One thing that had been annoying me ever since Debian migrated to systemd as /sbin/init is that boot message verbosity hasn't been the same. Previously, the cmdline option quiet merely suppressed the kernel's output to the bootscreen, but left the daemon startup messages alone. Not anymore. Nowadays, quiet produces a blank screen.

After some googling, I found the solution to that:

GRUB_CMDLINE_LINUX_DEFAULT="noquiet loglevel=5"

The former restores daemon startup messages, while the latter makes the kernel output only significant notices or more serious messages. On most of my hosts, it mostly reports inconsistencies in the ACPI configuration of the BIOS.

Another setting I find useful is a reboot delay in case a kernel panic happens:

GRUB_CMDLINE_LINUX="panic=15"

This gives me enough time to snap a picture of the screen output to attach to the bug report that will follow.
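
Both settings live in /etc/default/grub and only take effect after regenerating the GRUB configuration:

sudo update-grub

(update-grub is simply Debian's wrapper around grub-mkconfig.)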

03 November, 2020 03:48PM by Martin-Éric (noreply@blogger.com)

hackergotchi for Daniel Pocock

Daniel Pocock

The harassment claims in full

After announcing my nomination for Fedora Council, I received a few queries about the truth of the harassment claims. Certain people tried to muddy the waters with counter-accusations and fake news but we can find the truth by going back to the original words chosen by these women.

Helping victims of harassment is not an easy thing but it is something that any of us may be called upon to do from time to time. It is an inevitable burden of being in a leadership position.

I'd like to make the UN's Sustainable Development Goal 8, Decent Work and Economic Growth a core theme of my campaign for Fedora Council. Harassment is part of Target 8.8, safe working environments.

Harassment is never an easy thing to assist with but it is a lot harder when an organization fails to support volunteers like myself who take on these responsibilities.

A woman's public harassment / stalking complaint against Google

In early 2018, I was mentoring a woman from a developing country in the Outreachy program. The woman wrote a blog about mass surveillance (backup link in case of Github censorship), she included the following picture:

Google, Stalking, Harassment, women, interns, Outreachy

I admit making a mistake with this: I thought the blog post was political. I failed to see it for what it was, the woman I was responsible for mentoring had made a public written complaint against one of the sponsors. Her picture is clear evidence that:

  • Google, a sponsor, is stalking her
  • She does not freely consent to what Google does or she did not consent at all
  • She feels uncomfortable about it

Why are we so de-sensitized to this?

She is not alone. A British House of Commons committee has declared these firms to be digital gangsters in an official report. In October, the US Department of Justice branded Google's business model unlawful. Had a volunteer used words like digital gangster on a mailing list, somebody on the Google gravy train would have immediately accused them of harassment. Metaphors like digital gangster that help our minds to focus the spotlight on Google doing the wrong thing are a Code of Conduct violation. That is not a fault with the metaphor, it is a fault with the way people on Google's gravy train apply the Code of Conduct.

Yet the UK Parliament's Digital, Culture, Media and Sport Committee, the US Department of Justice and my intern were all right, whether Google sympathizers like to hear it or not.

Proponents of Codes of Conduct in free software organizations see this as a giant conspiracy and now they make counter-accusations of harassment against all of us who do not consent to Google's behavior. After that blog, various emails appeared where the intern and I were both wrongly accused of harassment of the vulnerable Google employees snug in their ivory towers.

The blog was syndicated on Planet Outreach, various other Planet sites and social media. It was noticed in many online forums where Google employees try to work alongside volunteers in free software projects. My intern had just pointed out that those people are not unlike peeping Toms in many ways. She did it in a very public way, while being paid by a Google-sponsored diversity program. Some of them felt bad.

Behind my back, other people in the organization sent official-looking threats to my intern. They didn't tell me about this, they do this behind the mentor's back because they know that threatening people is not quite right.

When an intern insults Google, Eric Schmidt doesn't come out on the TV, shed crocodile tears and claim he is being harassed. Outsiders who have received money from Google do it in more subtle ways. For example, after a series of interactions in my absence, they sent threats like this:

These interactions seemed to have set a very uncomfortable environment ... we are considering requesting a rejection of your application to the bursaries team

Chilling.

What bothers me most is that it took several months before this woman could tell me about the threats she received and how she felt. These were not threats from anonymous internet trolls, they were from people in leadership roles. Recipients of emails like that feel shame, even when they do nothing wrong and it prevents them speaking.

Shame and guilt are often factors in self-harm and suicide. For example, women are some of the biggest victims of problems relating to body image. It is ironic that in a diversity program, Google apologists are cultivating shame in women and increasing the risk of harm.

This is why it is so critical that there are a number of independent figures on the Fedora Council. Please consider nominating before the deadline. If a victim of harassment is not comfortable talking to some of us, having at least one person who she is comfortable talking to can make a huge difference.

A woman's complaint against local men in her region

The other harassment complaint came from a woman in a very different region.

1. Outreachy ineligibility

To completely understand the contradictions in this situation, we need to look over an extended timeline and consider multiple rounds of selection processes. This is one of many women who I had met at events. She had applied to many internships, including Outreachy. Before the harassment incident, Sage Sharp had a discussion with administrators and mentors and removed the woman from the shortlist with these comments:

From: Sage Sharp

[... snip ...] While you may not have intended it this way, this sounds like favoritism, since the candidates are in the same city as the mentor.

[... snip ...] Again, this sounds a bit like favoritism of a person who is already active in open source.

[... snip ...] Again, this may be how your experience with GSoC works, but the goal of Outreachy is to try and get people who aren't currently open source contributors (or who are not as active contributors) to be accepted as Outreachy interns. That's why we have the rule that past GSoC or Outreachy applicants are ineligible to apply.

2. An inconvenient truth

Some time later, the woman wrote a message to the leader of a free software organization, confirmed the source of harassment in her local environment and thanked me for the help I provide to women...

Thank you very much for being so supportive.

I read the comments on the thread and to be honest I am really sad that [REDACTED] said that. It is not true at all.

They ([REDACTED++]) pretend to support women but on the other hand their behavior towards many of us shows the opposite.

Daniel I feel bad because you have encouraged and helped not only me, but so many other people, no matter if they are [REDACTED] members or not, and also all the attendees from [REDACTED] to learn new things, to work and improve their skills and knowledge. They are doubting your good intentions just to remove the attention from the shady things that they are doing.

The [REDACTED] comment is really offensive to me and i feel it should be offensive to every woman who is part of the community.
I have been contributing and supporting [REDACTED] since its early days, and I have put a lot of effort and time, I do this because I believe in what it is meant to stand for and without waiting something in exchange, but the situation lately has been not very positive. Daniel has been present by chance in few cases where situations have been very hard to go through.

I would definitely like to talk to any of you and tell you more about everything that is happening here, its fine to me whether it is a video call, call or just emails.
Please tell me what would be more convenient to you.

It is worth looking at how the leader of the organization handled the concerns from this woman:

please feel to drop Daniel from any replies you wish to make, if you even wish to do so

I had gone to another city for a technical event. When young women asked if I would listen to the challenges they are facing, I gave them several hours of my time. The leader of the organization dismissed them with one of his typical one-line insults.

As with the other intern, that was a sign that somebody was hiding something.

3. Outreachy as hush money

It looks like they wanted the matter to go away. Some time later, the woman received an Outreachy internship from somebody else, completely contradicting the policy Sage Sharp had previously ruled on.

They knew the woman was interested in an internship and so they did this backflip, they broke their own rules to offer her one. They hoped to give her an incentive to stay quiet and stop talking to independent leadership figures like me about the real sources of harassment. This is diversity funding at its best, absorbing women into groupthink.

What really motivated them to bend the rules so dramatically in this case?

Summary

These cases demonstrate that while there are very different ways of handling harassment complaints, the proverbial carrot and stick, ranging from threats to hush money, the root causes of the problem are rarely taken seriously.

I volunteered to mentor young women in this program because I believed it was about empowerment and equality. What I've seen in practice suggests the opposite. In these cases, the program is training women to be obedient, submissive and silent.

None of those strategies will assist us with Sustainable Development Goal 8 and reducing hazards in the free software environment.

If you have experienced harassment in the free software community, in any organization, my first recommendation is that you seek advice from somebody you can reach in the real world and if it is a serious case, please seek assistance from a professional like your doctor. If elected to Fedora Council, I hope to be there for everybody but nonetheless, somebody you correspond with in an online environment is never equivalent to somebody you can meet in the real world. During this difficult time when many of us are spending more time than usual at home during the pandemic, it is even more critical to seek support from sources you can rely on.

Resources

03 November, 2020 12:00PM

hackergotchi for Martin Michlmayr

Martin Michlmayr

ledger2beancount 2.5 released

I released version 2.5 of ledger2beancount, a ledger to beancount converter.
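
For anyone who has not used it, the conversion itself is essentially a one-liner along these lines (assuming a ledger file called main.ledger; output goes to stdout):

ledger2beancount main.ledger > main.beancount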

Here are the changes in 2.5:

  • Don't create negative cost for lot without cost
  • Support complex implicit conversions
  • Handle typed metadata with value 0 correctly
  • Set per-unit instead of total cost when cost is missing from lot
  • Support commodity-less amounts
  • Convert transactions with no amounts or only 0 amounts to notes
  • Fix parsing of transaction notes
  • Keep tags in transaction notes on same line as transaction header
  • Add beancount config options for non-standard root names automatically
  • Fix conversion of fixated prices to costs
  • Fix removal of price when price==cost but when they use different number formats
  • Fix removal of price when price==cost but per-unit and total notation mixed
  • Fix detection of tags and metadata after posting/aux date
  • Use D directive to set default commodity for hledger
  • Improve support for postings with commodity-less amounts
  • Allow empty comments
  • Preserve leading whitespace in comments in postings and transaction headers
  • Preserve indentation for tags and metadata
  • Preserve whitespace between amount and comment
  • Refactor code to use more data structures
  • Remove dependency on Config::Onion module

Thanks to input from Remco Rijnders, Yuri Khan, and Thierry. Thanks to Stefano Zacchiroli and Kirill Goncharov for testing my changes.

You can get ledger2beancount from GitHub

03 November, 2020 07:52AM by Martin Michlmayr