23 May 2019

Planet Debian

Michael Stapelberg: Optional dependencies don’t work

In the i3 projects, we have always tried hard to avoid optional dependencies. There are a number of reasons behind it, and as I have recently encountered some of the downsides of optional dependencies firsthand, I summarized my thoughts in this article.

What is a (compile-time) optional dependency?

When building software from source, most programming languages and build systems support conditional compilation: different parts of the source code are compiled based on certain conditions.

An optional dependency is conditional compilation hooked up directly to a knob (e.g. command line flag, configuration file, …), with the effect that the software can now be built without an otherwise required dependency.

Let's walk through a few issues with optional dependencies.

Inconsistent experience in different environments

Software is usually not built by end users, but by packagers, at least when we are talking about Open Source.

Hence, end users don't see the knob for the optional dependency, they are just presented with the fait accompli: their version of the software behaves differently than other versions of the same software.

Depending on the kind of software, this situation can be made obvious to the user: for example, if the optional dependency is needed to print documents, the program can produce an appropriate error message when the user tries to print a document.

Sometimes, this isn't possible: when i3 introduced an optional dependency on cairo and pangocairo, the behavior itself (rendering window titles) worked in all configurations, but non-ASCII characters might break depending on whether i3 was compiled with cairo.

For users, it is frustrating to only discover in conversation that a program has a feature that the user is interested in, but it's not available on their computer. For support, this situation can be hard to detect, and even harder to resolve to the user's satisfaction.

Packaging is more complicated

Unfortunately, many build systems don't stop the build when optional dependencies are not present. Instead, you sometimes end up with a broken build, or, even worse: with a successful build that does not work correctly at runtime.

This means that packagers need to closely examine the build output to know which dependencies to make available. In the best case, there is a summary of available and enabled options, clearly outlining what this build will contain. In the worst case, you need to infer the features from the checks that are done, or work your way through the --help output.

The better alternative is to configure your build system so that it stops when any dependency is not found, thereby making packagers acknowledge each optional dependency by explicitly disabling the corresponding option.
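
As a rough illustration of that approach, here is a hypothetical configure-time check sketched in Python (the flag and module names are invented for illustration; i3's actual build system works differently):

    # Abort the build when the optional dependency is missing, unless the
    # packager explicitly opted out of the feature.
    import argparse
    import importlib.util
    import sys

    parser = argparse.ArgumentParser()
    parser.add_argument("--disable-printing", action="store_true",
                        help="build without the optional 'cups' dependency")
    args = parser.parse_args()

    if not args.disable_printing and importlib.util.find_spec("cups") is None:
        sys.exit("error: optional dependency 'cups' not found; "
                 "pass --disable-printing to build without printing support")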

Untested code paths bit rot

Code paths which are not used will inevitably bit rot. If you have optional dependencies, you need to test both the code path without the dependency and the code path with the dependency. It doesn't matter whether the tests are automated or manual, the test matrix must cover both paths.
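
For example, here is a minimal pytest sketch of such a matrix for a hypothetical module that treats Pillow as an optional dependency (the thumbnails module and its fallback are invented for illustration, not taken from i3):

    import pytest

    import thumbnails  # hypothetical: does `try: import PIL / except ImportError: PIL = None`

    def test_render_with_optional_dependency():
        pytest.importorskip("PIL")  # skip loudly instead of silently testing nothing
        assert thumbnails.render(b"fake-image-bytes") is not None

    def test_render_without_optional_dependency(monkeypatch):
        # Force the fallback code path even on machines where Pillow is installed.
        monkeypatch.setattr(thumbnails, "PIL", None)
        assert thumbnails.render(b"fake-image-bytes") is not None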

Interestingly enough, this principle seems to apply to all kinds of software projects (though the bit rot slows down as the pace of change slows down): one might think that important Open Source building blocks should have enough users to cover all sorts of configurations.

However, consider this example: building cairo without libxrender results in all GTK application windows, menus, etc. being displayed as empty grey surfaces. Cairo does not fail to build without libxrender, but the code path clearly is broken without libxrender.

Can we do without them?

I'm not saying optional dependencies should never be used. In fact, for bootstrapping, disabling dependencies can save a lot of work and can sometimes allow breaking circular dependencies. For example, in an early bootstrapping stage, binutils can be compiled with --disable-nls to disable internationalization.

However, optional dependencies are broken so often that I conclude they are overused. Read on and see for yourself whether you would rather commit to best practices or not introduce an optional dependency.

Best practices

If you do decide to make dependencies optional, please:

  1. Set up automated testing for all code path combinations.
  2. Fail the build until packagers explicitly pass a --disable flag.
  3. Tell users their version is missing a dependency at runtime, e.g. in --version.
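
For the third point, a small Python sketch of what that could look like (the feature names and values are invented; a real build system would generate them):

    import argparse

    FEATURES = {"printing": True, "pango-rendering": False}  # baked in at build time

    def version_string(name="exampletool", version="1.2.3"):
        enabled = ", ".join(f"{feature}={'yes' if on else 'no'}"
                            for feature, on in FEATURES.items())
        return f"{name} {version} (features: {enabled})"

    parser = argparse.ArgumentParser()
    parser.add_argument("--version", action="version", version=version_string())
    parser.parse_args()

Running the tool with --version then tells users and support staff exactly which optional features their build contains.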

23 May 2019 12:54pm GMT

Planet Python

Codementor: Building Machine Learning Data Pipeline using Apache Spark

Apache Spark is increasingly becoming popular in the field of Data Science because of its ability to deal with huge datasets and its capability to run computations in memory, which is...

23 May 2019 12:33pm GMT

EuroPython: EuroPython 2019: Monday and Tuesday activities for main conference attendees

Although the main conference starts on Wednesday, July 10th, there's already so much to do for attendees with the main conference ticket on Monday 8th and Tuesday 9th.

Beginners' Day and Sponsored Trainings

You can come to the workshops and trainings venue at FHNW Campus Muttenz and attend the Beginners' Day workshop or one of the sponsored trainings.

If you want to attend other workshops and trainings, you'll need a separate training ticket or combined ticket.

Details on the Beginners' Day workshop and the sponsored trainings will be announced separately.

Catering on training days not included

Since we have to budget carefully, lunch and coffee breaks are not included if you don't have a training ticket or combined ticket.

So that you don't go hungry, we have arranged for lunch coupons you can buy (price to be announced later). You can also go to the grocery store on the ground floor. For coffee breaks, you can go to the ground floor, to the 12th floor of the FHNW building, or outside to the beach bar (nice weather only) and buy drinks.

Enjoy,
-
EuroPython 2019 Team
https://ep2019.europython.eu/
https://www.europython-society.org/

23 May 2019 8:01am GMT

Test and Code: 75: Modern Testing Principles - Alan Page

Software testing, if done right, is done all the time, throughout the whole life of a software project. This is different than the verification and validation of a classical model of QA teams. It's more of a collaborative model that actually tries to help get great software out the door faster and iterate quicker.

One of the people at the forefront of this push is Alan Page. Alan and his podcast cohost Brent Jensen tried to boil down what modern testing looks like in the Modern Testing Principles.

I've got Alan here today, to talk about the principles, and also to talk about this transition from classical QA to testing specialists being embedded in software teams and then to software teams doing their own testing.

But that only barely scratches the surface of what we cover. I think you'll learn a lot from this discussion.

The seven principles of Modern Testing:

  1. Our priority is improving the business.
  2. We accelerate the team, and use models like Lean Thinking and the Theory of Constraints to help identify, prioritize and mitigate bottlenecks from the system.
  3. We are a force for continuous improvement, helping the team adapt and optimize in order to succeed, rather than providing a safety net to catch failures.
  4. We care deeply about the quality culture of our team, and we coach, lead, and nurture the team towards a more mature quality culture.
  5. We believe that the customer is the only one capable to judge and evaluate the quality of our product.
  6. We use data extensively to deeply understand customer usage and then close the gaps between product hypotheses and business impact.
  7. We expand testing abilities and knowhow across the team; understanding that this may reduce (or eliminate) the need for a dedicated testing specialist.

Special Guest: Alan Page.

Sponsored By:

Patreon Supporters: Help support the show with as little as $1 per month. Funds help pay for expenses associated with the show. (https://www.patreon.com/testpodcast)

Support Test & Code - Software Testing, Development, Python

Links:

Tooth of the Weasel - notes and rants about software and software quality: https://angryweasel.com/blog/
AB Testing - Alan and Brent talk about Modern Testing, including Agile, Data, Leadership, and more: https://www.angryweasel.com/ABTesting/
Modern Testing Principles: https://www.angryweasel.com/ABTesting/modern-testing-principles/
The Lean Startup: https://amzn.to/2WigLM2

23 May 2019 7:00am GMT

Planet Debian

François Marier: Installing Ubuntu 18.04 using both full-disk encryption and RAID1

I recently setup a desktop computer with two SSDs using a software RAID1 and full-disk encryption (i.e. LUKS). Since this is not a supported configuration in Ubuntu desktop, I had to use the server installation medium.

This is my version of these excellent instructions.

Server installer

Start by downloading the alternate server installer and verifying its signature:

  1. Download the required files:

     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/ubuntu-18.04.2-server-amd64.iso
     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/SHA256SUMS
     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/SHA256SUMS.gpg
    
    
  2. Verify the signature on the hash file:

     $ gpg --keyid-format long --keyserver hkps://keyserver.ubuntu.com --recv-keys 0xD94AA3F0EFE21092
     $ gpg --verify SHA256SUMS.gpg SHA256SUMS
     gpg: Signature made Fri Feb 15 08:32:38 2019 PST
     gpg:                using RSA key D94AA3F0EFE21092
     gpg: Good signature from "Ubuntu CD Image Automatic Signing Key (2012) <cdimage@ubuntu.com>" [undefined]
     gpg: WARNING: This key is not certified with a trusted signature!
     gpg:          There is no indication that the signature belongs to the owner.
     Primary key fingerprint: 8439 38DF 228D 22F7 B374  2BC0 D94A A3F0 EFE2 1092
    
    
  3. Verify the hash of the ISO file:

     $ sha256sum ubuntu-18.04.2-server-amd64.iso 
     a2cb36dc010d98ad9253ea5ad5a07fd6b409e3412c48f1860536970b073c98f5  ubuntu-18.04.2-server-amd64.iso
     $ grep ubuntu-18.04.2-server-amd64.iso SHA256SUMS
     a2cb36dc010d98ad9253ea5ad5a07fd6b409e3412c48f1860536970b073c98f5 *ubuntu-18.04.2-server-amd64.iso
    
    

Then copy it to a USB drive:

dd if=ubuntu-18.04.2-server-amd64.iso of=/dev/sdX

and boot with it.

Inside the installer, use manual partitioning to:

  1. Configure the physical partitions first.
  2. Configure the RAID arrays second.
  3. Configure the encrypted partitions last.

Here's the exact configuration I used:

I only set /dev/sda2 as the EFI partition because I found that adding a second EFI partition would break the installer.

I created the following RAID1 arrays:

I used /dev/md0 as my unencrypted /boot partition.

Then I created the following LUKS partitions:

Post-installation configuration

Once your new system is up, sync the EFI partitions using dd:

dd if=/dev/sda1 of=/dev/sdb1

and create a second EFI boot entry:

efibootmgr -c -d /dev/sdb -p 1 -L "ubuntu2" -l '\EFI\ubuntu\shimx64.efi'

Ensure that the RAID drives are fully sync'ed by keeping an eye on /proc/mdstat and then reboot, selecting "ubuntu2" in the UEFI/BIOS menu.

Once you have rebooted, remove the following package to speed up future boots:

apt purge btrfs-progs

To switch to the desktop variant of Ubuntu, install these meta-packages:

apt install ubuntu-desktop gnome

then use debfoster to remove unnecessary packages (in particular the ones that only come with the default Ubuntu server installation).

Fixing booting with degraded RAID arrays

Since I have run into RAID startup problems in the past, I expected having to fix up a few things to make degraded RAID arrays boot correctly.

I did not use LVM since I didn't really feel the need to add yet another layer of abstraction on top of my setup, but I found that the lvm2 package must still be installed:

apt install lvm2

with use_lvmetad = 0 in /etc/lvm/lvm.conf.

Then in order to automatically bring up the RAID arrays with 1 out of 2 drives, I added the following script in /etc/initramfs-tools/scripts/local-top/cryptraid:

 #!/bin/sh
 # Start the RAID arrays even when they are degraded (e.g. one drive missing)
 # so that the encrypted volumes on top of them can still be unlocked at boot.
 PREREQ="mdadm"
 prereqs()
 {
      echo "$PREREQ"
 }
 case "$1" in
 prereqs)
      prereqs
      exit 0
      ;;
 esac

 mdadm --run /dev/md0
 mdadm --run /dev/md1
 mdadm --run /dev/md2

before making that script executable:

chmod +x /etc/initramfs-tools/scripts/local-top/cryptraid

and refreshing the initramfs:

update-initramfs -u -k all

Disable suspend-to-disk

Since I use a random encryption key for the swap partition (to avoid having a second password prompt at boot time), suspend-to-disk is not going to work, so I disabled it by putting the following in /etc/initramfs-tools/conf.d/resume:

RESUME=none

and by adding noresume to the GRUB_CMDLINE_LINUX variable in /etc/default/grub before applying these changes:

update-grub
update-initramfs -u -k all

Test your configuration

With all of this in place, you should be able to do a final test of your setup:

  1. Shut down the computer and unplug the second drive.
  2. Boot with only the first drive.
  3. Shut down the computer and plug the second drive back in.
  4. Boot with both drives and re-add the second drive to the RAID array:

     mdadm /dev/md0 -a /dev/sdb3
     mdadm /dev/md1 -a /dev/sdb4
     mdadm /dev/md2 -a /dev/sdb2
    
    
  5. Wait until the RAID is done re-syncing and shut down the computer.

  6. Repeat steps 2-5 with the first drive unplugged instead of the second.
  7. Reboot with both drives plugged in.

At this point, you have a working setup that will gracefully degrade to a one-drive RAID array should one of your drives fail.

23 May 2019 4:30am GMT

22 May 2019

Planet Debian

Charles Plessy: Register your media types to the IANA !

As the maintainer of the mime-support package in Debian, I would like to give kudos to Petter Reinholdtsen, who just opened a ticket at the IANA to create a text/vnd.sosi media type. May his example be followed by others!

22 May 2019 10:19pm GMT

Planet Grep

Wim Leers: API-First Drupal: what's new in 8.7?

Drupal 8.7 was released with huge API-First improvements!

The REST API only got fairly small improvements in the 7th minor release of Drupal 8, because it reached a good level of maturity in 8.6 (where we added file uploads, exposed more of Drupal's data model and improved DX), and because we were of course busy with JSON:API :)

Thanks to everyone who contributed!

  1. JSON:API #2843147

    Need I say more? :) After keeping you informed about progress in October, November, December and January, followed by one of the most frantic Drupal core contribution periods I've ever experienced, the JSON:API module was committed to Drupal 8.7 on March 21, 2019.

    Surely you're now thinking: But when should I use Drupal's JSON:API module instead of the REST module? Well, choose REST if you have non-entity data you want to expose. In all other cases, choose JSON:API.

    In short, after having seen people use the REST module for years, we believe JSON:API makes solving the most common needs an order of magnitude simpler - sometimes two. Many things that require a lot of client-side code, a lot of requests or even server-side code you get for free when using JSON:API. That's why we added it, of course :) See Dries' overview if you haven't already. It includes a great video that Gabe recorded that shows how it simplifies common needs. And of course, check out the spec!

  2. datetime & daterange fields now respect standards #2926508

    They were always intended to respect standards, but didn't.

    For a field configured to store date + time:

    "field_datetime":[{
      "value": "2017-03-01T20:02:00",
    }]
    
    "field_datetime":[{
      "value": "2017-03-01T20:02:00+11:00",
    }]
    

    The site's timezone is now present! This is now a valid RFC3339 datetime string.

    For a field configured to store date only:

    "field_dateonly":[{
      "value": "2017-03-01T20:02:00",
    }]
    
    "field_dateonly":[{
      "value": "2017-03-01",
    }]
    

    Time information used to be present despite this being a date-only field! RFC3339 only covers combined date and time representations. For date-only representations, we need to use ISO 8601. There isn't a particular name for this date-only format, so we have to hardcode the format. See https://en.wikipedia.org/wiki/ISO_8601#Calendar_dates.

    Note: backward compatibility is retained: a datetime string without timezone information is still accepted. This is discouraged though, and deprecated: it will be removed for Drupal 9.

    Previous Drupal 8 minor releases have fixed similar problems. But this one, for the first time ever, implements these improvements as a @DataType-level normalizer, rather than a @FieldType-level normalizer. Perhaps that sounds too far into the weeds, but … it means it fixed this problem in both rest.module and jsonapi.module! We want the entire ecosystem to follow this example.

  3. rest.module has proven to be maintainable!

    In the previous update, I proudly declared that Drupal 8 core's rest.module was in a "maintainable" state. I'm glad to say that has proven to be true: six months later, the number of open issues still fits on "a single page" (fewer than 50). :) So don't worry as a rest.module user: we still triage the issue queue, and it's still maintained. But new features will primarily appear for jsonapi.module.


Want more nuance and detail? See the REST: top priorities for Drupal 8.7.x issue on drupal.org.

Are you curious what we're working on for Drupal 8.8? Want to follow along? Click the follow button at API-First: top priorities for Drupal 8.8.x - whenever things on the list are completed (or when the list gets longer), a comment gets posted. It's the best way to follow along closely!1

Was this helpful? Let me know in the comments!


For reference, historical data:


  1. ~50 comments per six months - so very little noise. ↩︎

22 May 2019 11:00am GMT

21 May 2019

Planet Grep

Staf Wagemakers: OPNsense Upgrade Failed: Out of Inodes

opnsense with no inodes

I use OPNsense as my firewall on a Pcengines Alix.

The primary reason is to have a firewall that is always up to date, unlike most commercial consumer-grade firewalls that are only supported for a few years. Having a firewall that runs open-source software - it's based on FreeBSD - also makes it easier to review and to verify that there are no back doors.

When I tried to upgrade it to the latest release - 19.1.7 - the upgrade failed because the filesystem ran out of inodes. There is already a topic about this at the OPNsense forum and a fix available for the upcoming nano OPNsense images.

But this will only resolve the issue when a new image becomes available and would require a reinstallation of the firewall.

Unlike ext2/3/4 or SGI's XFS on GNU/Linux, it isn't possible to increase the number of inodes on a UFS filesystem on FreeBSD.

To resolve the upgrade issue I created a backup of the existing filesystem, created a new filesystem with enough inodes, and restored the backup.

You'll find my journey of fixing the out-of-inodes issue below; all commands are executed on a FreeBSD system. Hopefully this is useful for someone.

Fixing the out-of-inodes issue

I connected the OPNsense CF disk to a USB cardreader on a FreeBSD virtual system.

Find the disk

To find the CF disk I use gpart show, as this will also display the disk labels etc.

root@freebsd:~ # gpart show
=>      40  41942960  vtbd0  GPT  (20G)
        40      1024      1  freebsd-boot  (512K)
      1064       984         - free -  (492K)
      2048   4194304      2  freebsd-swap  (2.0G)
   4196352  37744640      3  freebsd-zfs  (18G)
  41940992      2008         - free -  (1.0M)

=>      0  7847280  da0  BSD  (3.7G)
        0  7847280    1  freebsd-ufs  (3.7G)

=>      0  7847280  ufsid/5a6b137a11c4f909  BSD  (3.7G)
        0  7847280                       1  freebsd-ufs  (3.7G)

=>      0  7847280  ufs/OPNsense_Nano  BSD  (3.7G)
        0  7847280                  1  freebsd-ufs  (3.7G)

root@freebsd:~ # 

Backup

Create a backup with the old-school dd, just in case.

root@freebsd:~ # dd if=/dev/da0 of=/home/staf/opnsense.dd bs=4M status=progress
  4013948928 bytes (4014 MB, 3828 MiB) transferred 415.226s, 9667 kB/s
957+1 records in
957+1 records out
4017807360 bytes transferred in 415.634273 secs (9666689 bytes/sec)
root@freebsd:~ # 

mount

Verify that we can mount the filesystem.

root@freebsd:/ # mount /dev/da0a /mnt
root@freebsd:/ # cd /mnt
root@freebsd:/mnt # ls
.cshrc                          dev                             net
.probe.for.install.media        entropy                         proc
.profile                        etc                             rescue
.rnd                            home                            root
COPYRIGHT                       lib                             sbin
bin                             libexec                         sys
boot                            lost+found                      tmp
boot.config                     media                           usr
conf                            mnt                             var
root@freebsd:/mnt # cd ..
root@freebsd:/ #  df -i /mnt
Filesystem 1K-blocks    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/da0a    3916903 1237690 2365861    34%   38203  6595   85%   /mnt
root@freebsd:/ # 

umount

root@freebsd:/ # umount /mnt

dump

We'll create a backup with dump; we'll use this backup to restore the data after we create a new filesystem with enough inodes.

root@freebsd:/ # dump 0uaf /home/staf/opensense.dump /dev/da0a
  DUMP: Date of this level 0 dump: Sun May 19 10:43:56 2019
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/da0a to /home/staf/opensense.dump
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 1264181 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 16.21% done, finished in 0:25 at Sun May 19 11:14:51 2019
  DUMP: 33.27% done, finished in 0:20 at Sun May 19 11:14:03 2019
  DUMP: 49.65% done, finished in 0:15 at Sun May 19 11:14:12 2019
  DUMP: 66.75% done, finished in 0:09 at Sun May 19 11:13:57 2019
  DUMP: 84.20% done, finished in 0:04 at Sun May 19 11:13:41 2019
  DUMP: 99.99% done, finished soon
  DUMP: DUMP: 1267205 tape blocks on 1 volume
  DUMP: finished in 1800 seconds, throughput 704 KBytes/sec
  DUMP: level 0 dump on Sun May 19 10:43:56 2019
  DUMP: Closing /home/staf/opensense.dump
  DUMP: DUMP IS DONE
root@freebsd:/ # 

newfs

According to the newfs manpage, we can specify the inode density with the -i option. A lower number will give us more inodes.

root@freebsd:/ # newfs -i 1 /dev/da0a
density increased from 1 to 4096
/dev/da0a: 3831.7MB (7847280 sectors) block size 32768, fragment size 4096
        using 8 cylinder groups of 479.00MB, 15328 blks, 122624 inodes.
super-block backups (for fsck_ffs -b #) at:
 192, 981184, 1962176, 2943168, 3924160, 4905152, 5886144, 6867136
root@freebsd:/ # 

mount and verify

root@freebsd:/ # mount /dev/da0a /mnt
root@freebsd:/ # df -i
Filesystem         1K-blocks    Used    Avail Capacity iused    ifree %iused  Mounted on
zroot/ROOT/default  14562784 1819988 12742796    12%   29294 25485592    0%   /
devfs                      1       1        0   100%       0        0  100%   /dev
zroot/tmp           12742884      88 12742796     0%      11 25485592    0%   /tmp
zroot/usr/home      14522868 1780072 12742796    12%      17 25485592    0%   /usr/home
zroot/usr/ports     13473616  730820 12742796     5%  178143 25485592    1%   /usr/ports
zroot/usr/src       13442804  700008 12742796     5%   84122 25485592    0%   /usr/src
zroot/var/audit     12742884      88 12742796     0%       9 25485592    0%   /var/audit
zroot/var/crash     12742884      88 12742796     0%       8 25485592    0%   /var/crash
zroot/var/log       12742932     136 12742796     0%      21 25485592    0%   /var/log
zroot/var/mail      12742884      88 12742796     0%       8 25485592    0%   /var/mail
zroot/var/tmp       12742884      88 12742796     0%       8 25485592    0%   /var/tmp
zroot               12742884      88 12742796     0%       7 25485592    0%   /zroot
/dev/da0a            3677780       8  3383552     0%       2   980988    0%   /mnt
root@freebsd:/ # 

restore

root@freebsd:/mnt # restore rf /home/staf/opensense.dump
root@freebsd:/mnt # 

verify

root@freebsd:/mnt # df -ih /mnt
Filesystem    Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/da0a     3.5G    1.2G    2.0G    39%     38k  943k    4%   /mnt
root@freebsd:/mnt # 

Label

The installation had an OPNsense_Nano label, but underscores are no longer allowed in label names on FreeBSD 12. So I used OPNsense instead; we'll update /etc/fstab with the new label name.

root@freebsd:~ # tunefs -L OPNsense /dev/da0a
root@freebsd:~ # gpart show
=>      40  41942960  vtbd0  GPT  (20G)
        40      1024      1  freebsd-boot  (512K)
      1064       984         - free -  (492K)
      2048   4194304      2  freebsd-swap  (2.0G)
   4196352  37744640      3  freebsd-zfs  (18G)
  41940992      2008         - free -  (1.0M)

=>      0  7847280  da0  BSD  (3.7G)
        0  7847280    1  freebsd-ufs  (3.7G)

=>      0  7847280  ufsid/5ce123f5836f7018  BSD  (3.7G)
        0  7847280                       1  freebsd-ufs  (3.7G)

=>      0  7847280  ufs/OPNsense  BSD  (3.7G)
        0  7847280             1  freebsd-ufs  (3.7G)

root@freebsd:~ # 

update /etc/fstab

root@freebsd:~ # mount /dev/da0a /mnt
root@freebsd:~ # cd /mnt
root@freebsd:/mnt # cd etc
root@freebsd:/mnt/etc # vi fstab 
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/ufs/OPNsense       /               ufs     rw              1       1

umount

root@freebsd:/mnt/etc # cd
root@freebsd:~ # umount /mnt
root@freebsd:~ # 

test

              ______  _____  _____                         
             /  __  |/ ___ |/ __  |                        
             | |  | | |__/ | |  | |___  ___ _ __  ___  ___ 
             | |  | |  ___/| |  | / __|/ _ \ '_ \/ __|/ _ \
             | |__| | |    | |  | \__ \  __/ | | \__ \  __/
             |_____/|_|    |_| /__|___/\___|_| |_|___/\___|

 +=========================================+     @@@@@@@@@@@@@@@@@@@@@@@@@@@@
 |                                         |   @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 |  1. Boot Multi User [Enter]             |   @@@@@                    @@@@@
 |  2. Boot [S]ingle User                  |       @@@@@            @@@@@    
 |  3. [Esc]ape to loader prompt           |    @@@@@@@@@@@       @@@@@@@@@@@
 |  4. Reboot                              |         \\\\\         /////     
 |                                         |   ))))))))))))       (((((((((((
 |  Options:                               |         /////         \\\\\     
 |  5. [K]ernel: kernel (1 of 2)           |    @@@@@@@@@@@       @@@@@@@@@@@
 |  6. Configure Boot [O]ptions...         |       @@@@@            @@@@@    
 |                                         |   @@@@@                    @@@@@
 |                                         |   @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 |                                         |   @@@@@@@@@@@@@@@@@@@@@@@@@@@@  
 +=========================================+                                  
                                             19.1  ``Inspiring Iguana''   

/boot/kernel/kernel text=0x1406faf data=0xf2af4+0x2abeec syms=[0x4+0xf9f50+0x4+0x1910f5]
/boot/entropy size=0x1000
/boot/kernel/carp.ko text=0x7f90 data=0x374+0x74 syms=[0x4+0xeb0+0x4+0xf40]
/boot/kernel/if_bridge.ko text=0x7b74 data=0x364+0x3c syms=[0x4+0x1020+0x4+0x125f]
loading required module 'bridgestp'
/boot/kernel/bridgestp.ko text=0x4878 data=0xe0+0x18 syms=[0x4+0x6c0+0x4+0x65c]
/boot/kernel/if_enc.ko text=0x1198 data=0x2b8+0x8 syms=[0x4+0x690+0x4+0x813]
/boot/kernel/if_gre.ko text=0x31d8 data=0x278+0x30 syms=[0x4+0xa30+0x4+0xab8]
<snip>
Root file system: /dev/ufs/OPNsense
Sun May 19 10:28:00 UTC 2019

*** OPNsense.stafnet: OPNsense 19.1.6 (i386/OpenSSL) ***
<snip>

 HTTPS: SHA256 E8 9F B2 8B BE F9 D7 2D 00 AD D3 D5 60 E3 77 53
           3D AC AB 81 38 E4 D2 75 9E 04 F9 33 FF 76 92 28
 SSH:   SHA256 FbjqnefrisCXn8odvUSsM8HtzNNs+9xR/mGFMqHXjfs (ECDSA)
 SSH:   SHA256 B6R7GRL/ucRL3JKbHL1OGsdpHRDyotYukc77jgmIJjQ (ED25519)
 SSH:   SHA256 8BOmgp8lSFF4okrOUmL4YK60hk7LTg2N08Hifgvlq04 (RSA)

FreeBSD/i386 (OPNsense.stafnet) (ttyu0)

login: 
root@OPNsense:~ # df -ih
Filesystem           Size    Used   Avail Capacity iused ifree %iused  Mounted on
/dev/ufs/OPNsense    3.5G    1.2G    2.0G    39%     38k  943k    4%   /
devfs                1.0K    1.0K      0B   100%       0     0  100%   /dev
tmpfs                307M     15M    292M     5%     203  2.1G    0%   /var
tmpfs                292M     88K    292M     0%      27  2.1G    0%   /tmp
devfs                1.0K    1.0K      0B   100%       0     0  100%   /var/unbound/dev
devfs                1.0K    1.0K      0B   100%       0     0  100%   /var/dhcpd/dev
root@OPNsense:~ # 
CTRL-A Z for help | 115200 8N1 | NOR | Minicom 2.7.1 | VT102 | Offline | ttyUSB0                                    

update

login: root
Password:
----------------------------------------------
|      Hello, this is OPNsense 19.1          |         @@@@@@@@@@@@@@@
|                                            |        @@@@         @@@@
| Website:      https://opnsense.org/        |         @@@\\\   ///@@@
| Handbook:     https://docs.opnsense.org/   |       ))))))))   ((((((((
| Forums:       https://forum.opnsense.org/  |         @@@///   \\\@@@
| Lists:        https://lists.opnsense.org/  |        @@@@         @@@@
| Code:         https://github.com/opnsense  |         @@@@@@@@@@@@@@@
----------------------------------------------

*** OPNsense.stafnet: OPNsense 19.1.6 (i386/OpenSSL) ***
<snip>

 HTTPS: SHA256 E8 9F B2 8B BE F9 D7 2D 00 AD D3 D5 60 E3 77 53
               3D AC AB 81 38 E4 D2 75 9E 04 F9 33 FF 76 92 28
 SSH:   SHA256 FbjqnefrisCXn8odvUSsM8HtzNNs+9xR/mGFMqHXjfs (ECDSA)
 SSH:   SHA256 B6R7GRL/ucRL3JKbHL1OGsdpHRDyotYukc77jgmIJjQ (ED25519)
 SSH:   SHA256 8BOmgp8lSFF4okrOUmL4YK60hk7LTg2N08Hifgvlq04 (RSA)

  0) Logout                              7) Ping host
  1) Assign interfaces                   8) Shell
  2) Set interface IP address            9) pfTop
  3) Reset the root password            10) Firewall log
  4) Reset to factory defaults          11) Reload all services
  5) Power off system                   12) Update from console
  6) Reboot system                      13) Restore a backup

Enter an option: 12

Fetching change log information, please wait... done

This will automatically fetch all available updates, apply them,
and reboot if necessary.

This update requires a reboot.

Proceed with this action? [y/N]: y 

Have fun!

Links

21 May 2019 5:20pm GMT

Planet Lisp

Quicklisp news: May 2019 Quicklisp dist update now available

New projects:

Updated projects: 3d-vectors, also-alsa, april, architecture.builder-protocol, array-operations, asdf-viz, binary-io, bobbin, bst, cacle, cepl.drm-gbm, ceramic, chanl, cl+ssl, cl-abnf, cl-ana, cl-autowrap, cl-collider, cl-conllu, cl-db3, cl-dbi, cl-decimals, cl-egl, cl-emb, cl-enchant, cl-environments, cl-fond, cl-freetype2, cl-general-accumulator, cl-glfw3, cl-gobject-introspection, cl-html5-parser, cl-i18n, cl-json-pointer, cl-las, cl-mango, cl-murmurhash, cl-neovim, cl-netpbm, cl-notebook, cl-patterns, cl-png, cl-ppcre, cl-prevalence, cl-project, cl-pslib, cl-pslib-barcode, cl-random-forest, cl-readline, cl-reddit, cl-sasl, cl-scsu, cl-skkserv, cl-smtp, cl-stomp, cl-str, cl-torrents, cl-unicode, cl-who, cl-yesql, clack, clobber, closer-mop, clx, command-line-arguments, common-lisp-jupyter, contextl, croatoan, curry-compose-reader-macros, cxml-stp, data-lens, datafly, dbus, deploy, dexador, drakma, easy-audio, eclector, esrap, eventbus, fare-quasiquote, femlisp, flare, for, fxml, game-math, generic-cl, glsl-toolkit, golden-utils, harmony, helambdap, hu.dwim.asdf, inlined-generic-function, inquisitor, jp-numeral, kenzo, lack, lass, lift, lisp-chat, lisp-critic, lispbuilder, listopia, literate-lisp, livesupport, markdown.cl, mcclim, metatilities-base, mito, mito-attachment, mmap, modf, modularize-interfaces, nodgui, oclcl, open-location-code, org-davep-dict, org-davep-dictrepl, osc, overlord, parachute, parsley, petalisp, piggyback-parameters, pjlink, plexippus-xpath, postmodern, protest, pzmq, ql-checkout, qlot, qmynd, qtools-ui, query-fs, quilc, quri, qvm, random-state, read-number, replic, restricted-functions, rove, rpcq, s-xml-rpc, sc-extensions, sel, serapeum, simple-rgb, simplified-types, sly, solid-engine, spinneret, split-sequence, staple, static-dispatch, stumpwm, sucle, sxql, trace-db, trivia, trivial-cltl2, trivial-continuation, trivial-garbage, trivial-json-codec, trivial-monitored-thread, trivial-object-lock, trivial-signal, trivial-timer, trivial-utilities, trivial-variable-bindings, type-i, type-r, uiop, usocket, vernacular, with-c-syntax, wordnet.

Removed projects: cffi-objects, cl-blapack, common-lisp-stat, fnv, lisp-matrix, qtools-commons, simple-gui, trivia.balland2006.

A number of projects stopped working because of internal updates to SBCL - projects which used them have not been updated to use supported interfaces instead.

To get this update, use (ql:update-dist "quicklisp"). Enjoy!

21 May 2019 1:19pm GMT

16 May 2019

Planet Grep

Xavier Mertens: [SANS ISC] The Risk of Authenticated Vulnerability Scans

I published the following diary on isc.sans.edu: "The Risk of Authenticated Vulnerability Scans":

NTLM relay attacks have been a well-known opportunity to perform attacks against Microsoft Windows environments for a while and they remain usually successful. The magic with NTLM relay attacks? You don't need to lose time to crack the hashes, just relay them to the victim machine. To achieve this, we need a "responder" that will capture the authentication session on a system and relay it to the victim. A lab is easy to setup: Install the Responder framework. The framework contains a tool called MultiRelay.py which helps to relay the captured NTLM authentication to a specific target and, if the attack is successful, execute some code! (There are plenty of blog posts that explain in details how to (ab)use of this attack scenario)… [Read more]

[The post [SANS ISC] The Risk of Authenticated Vulnerability Scans has been first published on /dev/random]

16 May 2019 11:21am GMT

30 Apr 2019

Planet Lisp

Lispers.de: Lisp-Meetup in Hamburg on Monday, 6th May 2019

We meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CET on 6th May 2019.

Christian was at ELS and will report, and we will talk about our attempts at a little informal language benchmark.

This is an informal gathering of Lispers of all experience levels.

Update: the fine folks from stk-hamburg.de will be there and talk about their Lisp-based work!

30 Apr 2019 12:00am GMT

26 Apr 2019

Planet Sun

First Image of a Black Hole – Event Horizon

The Event Horizon Telescope (EHT) - a planet-scale array of 8 ground-based radio telescopes and part of an international collaboration - captured the first image of a black hole. On April 10th 2019, EHT researchers disclosed the first direct visual evidence of a supermassive black hole in the heart of the Galaxy Messier 87.

26 Apr 2019 2:32am GMT

23 Apr 2019

Planet Lisp

Paul Khuong: Fractional Set Covering With Experts

Last winter break, I played with one of the annual capacitated vehicle routing problem (CVRP) "Santa Claus" contests. Real world family stuff took precedence, so, after the obvious LKH with Concorde polishing for individual tours, I only had enough time for one diversification moonshot. I decided to treat the high level problem of assembling prefabricated routes as a set covering problem: I would solve the linear programming (LP) relaxation for the min-cost set cover, and use randomised rounding to feed new starting points to LKH. Add a lot of luck, and that might just strike the right balance between solution quality and diversity.

Unsurprisingly, luck failed to show up, but I had ulterior motives: I'm much more interested in exploring first order methods for relaxations of combinatorial problems than in solving CVRPs. The routes I had accumulated after a couple days turned into a set covering LP with 1.1M decision variables, 10K constraints, and 20M nonzeros. That's maybe denser than most combinatorial LPs (the aspect ratio is definitely atypical), but 0.2% non-zeros is in the right ballpark.

As soon as I had that fractional set cover instance, I tried to solve it with a simplex solver. Like any good Googler, I used Glop... and stared at a blank terminal for more than one hour.

Having observed that lack of progress, I implemented the toy I really wanted to try out: first order online "learning with experts" (specifically, AdaHedge) applied to LP optimisation. I let this not-particularly-optimised serial CL code run on my 1.6 GHz laptop for 21 hours, at which point the first order method had found a 4.5% infeasible solution (i.e., all the constraints were satisfied with \(\ldots \geq 0.955\) instead of \(\ldots \geq 1\)). I left Glop running long after the contest was over, and finally stopped it with no solution after more than 40 days on my 2.9 GHz E5.

Given the shape of the constraint matrix, I would have loved to try an interior point method, but all my licenses had expired, and I didn't want to risk OOMing my workstation. Erling Andersen was later kind enough to test Mosek's interior point solver on it. The runtime was much more reasonable: 10 minutes on 1 core, and 4 on 12 cores, with the sublinear speed-up mostly caused by the serial crossover to a simplex basis.

At 21 hours for a naïve implementation, the "learning with experts" first order method isn't practical yet, but also not obviously uninteresting, so I'll write it up here.

Using online learning algorithms for the "experts problem" (e.g., Freund and Schapire's Hedge algorithm) to solve linear programming feasibility is now a classic result; Jeremy Kun has a good explanation on his blog. What's new here is:

  1. Directly solving the optimisation problem.
  2. Confirming that the parameter-free nature of AdaHedge helps.

The first item is particularly important to me because it's a simple modification to the LP feasibility meta-algorithm, and might make the difference between a tool that's only suitable for theoretical analysis and a practical approach.

I'll start by reviewing the experts problem, and how LP feasibility is usually reduced to the former problem. After that, I'll cast the reduction as a surrogate relaxation method, rather than a Lagrangian relaxation; optimisation should flow naturally from that point of view. Finally, I'll guess why I had more success with AdaHedge this time than with Multiplicative Weight Update eight years ago.1

The experts problem and LP feasibility

I first heard about the experts problem while researching dynamic sorted set data structures: Igal Galperin's PhD dissertation describes scapegoat trees, but is really about online learning with experts. Arora, Hazan, and Kale's 2012 survey of multiplicative weight update methods is probably a better introduction to the topic ;)

The experts problem comes in many variations. The simplest form sounds like the following. Assume you're playing a binary prediction game over a predetermined number of turns, and have access to a fixed finite set of experts at each turn. At the beginning of every turn, each expert offers their binary prediction (e.g., yes it will rain today, or it will not rain today). You then have to make a prediction yourself, with no additional input. The actual result (e.g., it didn't rain today) is revealed at the end of the turn. In general, you can't expect to be right more often than the best expert at the end of the game. Is there a strategy that bounds the "regret," how many more wrong predictions you'll make compared to the expert(s) with the highest number of correct predictions, and in what circumstances?

Amazingly enough, even with an omniscient adversary that has access to your strategy and determines both the experts' predictions and the actual result at the end of each turn, a stream of random bits (hidden from the adversary) suffice to bound our expected regret in \(\mathcal{O}(\sqrt{T}\,\lg n)\), where \(T\) is the number of turns and \(n\) the number of experts.

I long had trouble with that claim: it just seems too good of a magic trick to be true. The key realisation for me was that we're only comparing against individual experts. If each expert is a move in a matrix game, that's the same as claiming you'll never do much worse than any pure strategy. One example of a pure strategy is always playing rock in Rock-Paper-Scissors; pure strategies are really bad! The trick is actually in making that regret bound useful.

We need a more continuous version of the experts problem for LP feasibility. We're still playing a turn-based game, but, this time, instead of outputting a prediction, we get to "play" a mixture of the experts (with non-negative weights that sum to 1). At the beginning of each turn, we describe what weight we'd like to give to each experts (e.g., 60% rock, 40% paper, 0% scissors). The cost (equivalently, payoff) for each expert is then revealed (e.g., \(\mathrm{rock} = -0.5\), \(\mathrm{paper} = 0.5\), \(\mathrm{scissors} = 0\)), and we incur the weighted average from our play (e.g., \(60\% \cdot -0.5 + 40\% \cdot 0.5 = -0.1\)) before playing the next round.2 The goal is to minimise our worst-case regret, the additive difference between the total cost incurred by our mixtures of experts and that of the a posteriori best single expert. In this case as well, online learning algorithms guarantee regret in \(\mathcal{O}(\sqrt{T} \, \lg n)\)

This line of research is interesting because simple algorithms achieve that bound, with explicit constant factors on the order of 1,3 and those bounds are known to be non-asymptotically tight for a large class of algorithms. Like dense linear algebra or fast Fourier transforms, where algorithms are often compared by counting individual floating point operations, online learning has matured into such tight bounds that worst-case regret is routinely presented without Landau notation. Advances improve constant factors in the worst case, or adapt to easier inputs in order to achieve "better than worst case" performance.

The reduction below lets us take any learning algorithm with an additive regret bound, and convert it to an algorithm with a corresponding worst-case iteration complexity bound for \(\varepsilon\)-approximate LP feasibility. An algorithm that promises low worst-case regret in \(\mathcal{O}(\sqrt{T})\) gives us an algorithm that needs at most \(\mathcal{O}(1/\varepsilon\sp{2})\) iterations to return a solution that almost satisfies every constraint in the linear program, where each constraint is violated by \(\varepsilon\) or less (e.g., \(x \leq 1\) is actually \(x \leq 1 + \varepsilon\)).

We first split the linear program in two components, a simple domain (e.g., the non-negative orthant or the \([0, 1]\sp{d}\) box) and the actual linear constraints. We then map each of the latter constraints to an expert, and use an arbitrary algorithm that solves our continuous version of the experts problem as a black box. At each turn, the black box will output a set of non-negative weights for the constraints (experts). We will average the constraints using these weights, and attempt to find a solution in the intersection of our simple domain and the weighted average of the linear constraints.

Let's use Stigler's Diet Problem with three foods and two constraints as a small example, and further simplify it by disregarding the minimum value for calories, and the maximum value for vitamin A. Our simple domain here is at least the non-negative orthant: we can't ingest negative food. We'll make things more interesting by also making sure we don't eat more than 10 servings of any food per day.

The first constraint says we mustn't get too many calories

\[72 x\sb{\mathrm{corn}} + 121 x\sb{\mathrm{milk}} + 65 x\sb{\mathrm{bread}} \leq 2250,\]

and the second constraint (tweaked to improve this example) ensures we get enough vitamin A

\[107 x\sb{\mathrm{corn}} + 400 x\sb{\mathrm{milk}} \geq 5000,\]

or, equivalently,

\[-107 x\sb{\mathrm{corn}} - 400 x\sb{\mathrm{milk}} \leq -5000,\]

Given weights \([¾, ¼]\), the weighted average of the two constraints is

\[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\]

where the coefficients for each variable and for the right-hand side were averaged independently.

The subproblem asks us to find a feasible point in the intersection of these two constraints: \[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

Classically, we claim that this is just Lagrangian relaxation, and find a solution to

\[\min 27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

In the next section, I'll explain why I think this analogy is wrong and worse than useless. For now, we can easily find the minimum one variable at a time, and find the solution \(x\sb{\mathrm{corn}} = 0\), \(x\sb{\mathrm{milk}} = 10\), \(x\sb{\mathrm{bread}} = 0\), with objective value \(-92.5\) (which is \(530\) less than \(437.5\)).
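
Here's the same computation as a tiny Python sketch, to make the arithmetic easy to check (the data mirrors the toy Diet instance above; none of this is the contest code):

    import numpy as np

    # The two constraints in "A x <= b" form: calories, then (negated) vitamin A.
    A = np.array([[72.0, 121.0, 65.0],
                  [-107.0, -400.0, 0.0]])
    b = np.array([2250.0, -5000.0])

    w = np.array([0.75, 0.25])          # weights produced by the experts algorithm

    avg_a, avg_b = w @ A, w @ b         # [27.25, -9.25, 48.75] and 437.5

    # Minimise avg_a . x over the box [0, 10]^3, one variable at a time:
    # a variable goes to 10 if its coefficient is negative, to 0 otherwise.
    x = np.where(avg_a < 0, 10.0, 0.0)  # [0, 10, 0]
    print(avg_a @ x)                    # -92.5: the surrogate is satisfied by 530
    print(A @ x - b)                    # [-1040, 1000]: per-constraint violations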

In general, three things can happen at this point. We could discover that the subproblem is infeasible. In that case, the original non-relaxed linear program itself is infeasible: any solution to the original LP satisfies all of its constraints, and thus would also satisfy any weighted average of the same constraints. We could also be extremely lucky and find that our optimal solution to the relaxation is (\(\varepsilon\)-)feasible for the original linear program; we can stop with a solution. More commonly, we have a solution that's feasible for the relaxation, but not for the original linear program.

Since that solution satisfies the weighted average constraint, the black box's payoff for this turn (and for every other turn) is non-positive. In the current case, the first constraint (on calories) is satisfied by \(1040\), while the second (on vitamin A) is violated by \(1000\). On weighted average, the constraints are satisfied by \(\frac{1}{4}(3 \cdot 1040 - 1000) = 530.\) Equivalently, they're violated by \(-530\) on average.

We'll add that solution to an accumulator vector that will come in handy later.

The next step is the key to the reduction: we'll derive payoffs (negative costs) for the black box from the solution to the last relaxation. Each constraint (expert) has a payoff equal to its level of violation in the relaxation's solution. If a constraint is strictly satisfied, the payoff is negative; for example, the constraint on calories is satisfied by \(1040\), so its payoff this turn is \(-1040\). The constraint on vitamin A is violated by \(1000\), so its payoff this turn is \(1000\). Next turn, we expect the black box to decrease the weight of the constraint on calories, and to increase the weight of the one on vitamin A.

After \(T\) turns, the total payoff for each constraint is equal to the sum of violations by all solutions in the accumulator. Once we divide both sides by \(T\), we find that the divided payoff for each constraint is equal to its violation by the average of the solutions in the accumulator. For example, if we have two solutions, one that violates the calories constraint by \(500\) and another that satisfies it by \(1000\) (violates it by \(-1000\)), the total payoff for the calories constraint is \(-500\), and the average of the two solutions does strictly satisfy the linear constraint by \(\frac{500}{2} = 250\)!

We also know that we only generated feasible solutions to the relaxed subproblem (otherwise, we'd have stopped and marked the original LP as infeasible), so the black box's total payoff is \(0\) or negative.

Finally, we assumed that the black box algorithm guarantees an additive regret in \(\mathcal{O}(\sqrt{T}\, \lg n)\), so the black box's payoff of (at most) \(0\) means that any constraint's payoff is at most \(\mathcal{O}(\sqrt{T}\, \lg n)\). After dividing by \(T\), we obtain a bound on the violation by the arithmetic mean of all solutions in the accumulator: for every constraint, that violation is in \(\mathcal{O}\left(\frac{\lg n}{\sqrt{T}}\right)\). In other words, the number of iterations \(T\) must scale with \(\mathcal{O}\left(\frac{\lg n}{\varepsilon\sp{2}}\right)\), which isn't bad when \(n\) is in the millions but \(\varepsilon \approx 0.01\).
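
Putting the whole reduction together, here's a minimal Python sketch for a box-constrained LP, with plain Hedge standing in for the black box (my own illustration with ballpark parameter choices; the actual experiment used AdaHedge and Common Lisp, and a practical version must also scale the payoffs by the width of the constraints):

    import numpy as np

    def hedge_feasibility(A, b, lo, hi, epsilon):
        """Average box-subproblem solutions into an approximately feasible x
        for A x <= b over the box [lo, hi]^d, or report infeasibility."""
        m, d = A.shape
        T = int(np.ceil(np.log(m) / epsilon ** 2)) + 1   # iterations ~ lg(n) / eps^2
        eta = np.sqrt(np.log(m) / T)                     # fixed-horizon Hedge step size
        payoffs = np.zeros(m)                            # cumulative payoff per constraint
        x_sum = np.zeros(d)
        for _ in range(T):
            w = np.exp(eta * (payoffs - payoffs.max()))  # more weight on violated constraints
            w /= w.sum()
            avg_a, avg_b = w @ A, w @ b                  # surrogate constraint
            x = np.where(avg_a < 0, hi, lo)              # minimise avg_a . x over the box
            if avg_a @ x > avg_b:                        # even the minimiser violates it
                raise ValueError("the original LP is infeasible over the box")
            payoffs += A @ x - b                         # payoff = violation of each constraint
            x_sum += x
        return x_sum / T                                 # the averaged, nearly feasible solution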

Theoreticians find this reduction interesting because there are concrete implementations of the black box, e.g., the multiplicative weight update (MWU) method with non-asymptotic bounds. For many problems, this makes it possible to derive the exact number of iterations necessary to find an \(\varepsilon-\)feasible fractional solution, given \(\varepsilon\) and the instance's size (but not the instance itself).

That's why algorithms like MWU are theoretically useful tools for fractional approximations, even when we already have subgradient methods that only need \(\mathcal{O}\left(\frac{1}{\varepsilon}\right)\) iterations: state-of-the-art algorithms for learning with experts come with explicit non-asymptotic regret bounds that yield, for many problems, iteration bounds that only depend on the instance's size, but not its data. While the iteration count when solving LP feasibility with MWU scales with \(\frac{1}{\varepsilon\sp{2}}\), it is merely proportional to \(\lg n\), the log of the number of linear constraints. That's attractive, compared to subgradient methods for which the iteration count scales with \(\frac{1}{\varepsilon}\), but also scales linearly with respect to instance-dependent values like the distance between the initial dual solution and the optimum, or the Lipschitz constant of the Lagrangian dual function; these values are hard to bound, and are often proportional to the square root of the number of constraints. Given the choice between \(\mathcal{O}\left(\frac{\lg n}{\varepsilon\sp{2}}\right)\) iterations with explicit constants, and a looser \(\mathcal{O}\left(\frac{\sqrt{n}}{\varepsilon}\right)\), it's obvious why MWU and online learning are powerful additions to the theory toolbox.

Theoreticians are otherwise not concerned with efficiency, so the usual answer to someone asking about optimisation is to tell them they can always reduce linear optimisation to feasibility with a binary search on the objective value. I once made the mistake of implementing that binary search strategy. Unsurprisingly, it wasn't useful. I also tried another theoretical reduction, where I looked for a pair of primal and dual \(\varepsilon-\)feasible solutions that happened to have the same objective value. That also failed, in a more interesting manner: since the two solutions had to have almost the same value, the universe spited me by sending back solutions that were primal and dual infeasible in the worst possible way. In the end, the second reduction generated fractional solutions that were neither feasible nor superoptimal, which really isn't helpful.

Direct linear optimisation with experts

The reduction above works for any "simple" domain, as long as it's convex and we can solve the subproblems, i.e., find a point in the intersection of the simple domain and a single linear constraint or determine that the intersection is empty.

The set of (super)optimal points in some initial simple domain is still convex, so we could restrict our search to the subset of the domain that is superoptimal for the linear program we wish to optimise, and directly reduce optimisation to the feasibility problem solved in the last section, without binary search.

That sounds silly at first: how can we find solutions that are superoptimal when we don't even know the optimal value?

Remember that the subproblems are always relaxations of the original linear program. We can port the objective function from the original LP over to the subproblems, and optimise the relaxations. Any solution that's optimal for a relaxation must have an optimal or superoptimal value for the original LP.

Rather than treating the black box online solver as a generator of Lagrangian dual vectors, we're using its weights as solutions to the surrogate relaxation dual. The latter interpretation isn't just more powerful by handling objective functions. It also makes more sense: the weights generated by algorithms for the experts problem are probabilities, i.e., they're non-negative and sum to \(1\). That's also what's expected for surrogate dual vectors, but definitely not the case for Lagrangian dual vectors, even when restricted to \(\leq\) constraints.

We can do even better!

Unlike Lagrangian dual solvers, which only converge when fed (approximate) subgradients and thus make us (nearly) optimal solutions to the relaxed subproblems, our reduction to the experts problem only needs feasible solutions to the subproblems. That's all we need to guarantee an \(\varepsilon-\)feasible solution to the initial problem in a bounded number of iterations. We also know exactly how that \(\varepsilon-\)feasible solution is generated: it's the arithmetic mean of the solutions for relaxed subproblems.

This lets us decouple finding lower bounds from generating feasible solutions that will, on average, \(\varepsilon-\)satisfy the original LP. In practice, the search for an \(\varepsilon-\)feasible solution that is also superoptimal will tend to improve the lower bound. However, nothing forces us to evaluate lower bounds synchronously, or to only use the experts problem solver to improve our bounds.

We can find a new bound from any vector of non-negative constraint weights: they always yield a valid surrogate relaxation. We can solve that relaxation, and update our best bound whenever the relaxation's optimum improves on it. The Diet subproblem earlier had

\[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

Adding the original objective function back yields the linear program

\[\min 0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\] subject to \[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10,\]

which has a trivial optimal solution at \([0, 0, 0]\).

When we generate a feasible solution for the same subproblem, we can use any valid bound on the objective value to find the most feasible solution that is also assuredly (super)optimal. For example, if some oracle has given us a lower bound of \(2\) for the original Diet problem, we can solve for

\[\min 27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\leq 2\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

We can relax the objective value constraint further, since we know that the final \(\varepsilon-\)feasible solution is a simple arithmetic mean. Given the same best bound of \(2\), and, e.g., a current average of \(3\) solutions with a value of \(1.9\), a new solution with an objective value of \(2.3\) (more than our best bound, so not necessarily optimal!) would yield a new average solution with a value of \(2\), which is still (super)optimal. This means we can solve the more relaxed subproblem

\[\min 27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\leq 2.3\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

Given a bound on the objective value, we swapped the constraint and the objective; the goal is to maximise feasibility, while generating a solution that's "good enough" to guarantee that the average solution is still (super)optimal.
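The slack in that objective-value constraint follows directly from the averaging: if the current average of \(k\) solutions has value \(\bar{v}\) and the best lower bound is \(z\), the next solution keeps the average (super)optimal as long as its value \(v\) satisfies

\[\frac{k \bar{v} + v}{k + 1} \leq z, \quad \text{i.e.,} \quad v \leq (k + 1) z - k \bar{v};\]

with \(k = 3\), \(\bar{v} = 1.9\), and \(z = 2\), that's \(4 \cdot 2 - 3 \cdot 1.9 = 2.3\), the right-hand side used above.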

For box-constrained linear programs where the box is the convex domain, subproblems are bounded linear knapsacks, so we can simply stop the greedy algorithm as soon as the objective value constraint is satisfied, or when the knapsack constraint becomes active (we found a better bound).
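Here's a minimal sketch of that greedy, written for the set-covering shape the post actually solves (surrogate constraint \(a \cdot x \geq \mathrm{rhs}\), minimise cost), with the two early exits as I read them; the function name, argument layout, and the guards against zero coefficients are mine:

    ;; Greedy for the fractional covering knapsack
    ;;   min c.x  s.t.  a.x >= rhs,  0 <= x <= 1,
    ;; raising variables in order of increasing cost per unit of coverage.
    ;; Stop early once c.x reaches OBJECTIVE-BUDGET (most feasible solution
    ;; that is still assuredly (super)optimal) or a.x reaches RHS (the
    ;; relaxation is solved and c.x is a fresh lower bound).
    (defun greedy-covering-knapsack (costs coverage rhs objective-budget)
      "COSTS and COVERAGE are double-float vectors of the same length."
      (let* ((n (length costs))
             (order (sort (loop for i below n collect i) #'<
                          :key (lambda (i) (/ (aref costs i)
                                              (max (aref coverage i) 1d-9)))))
             (x (make-array n :element-type 'double-float :initial-element 0d0))
             (covered 0d0)
             (spent 0d0))
        (dolist (i order (values x covered spent))
          (when (or (>= covered rhs) (>= spent objective-budget))
            (return (values x covered spent)))
          ;; raise x_i as far as the box, the coverage still needed,
          ;; and the remaining objective budget allow.
          (let ((step (min 1d0
                           (/ (- rhs covered) (max (aref coverage i) 1d-9))
                           (/ (- objective-budget spent) (max (aref costs i) 1d-9)))))
            (setf (aref x i) step)
            (incf covered (* step (aref coverage i)))
            (incf spent (* step (aref costs i)))))))

The only non-trivial work is the sort by ratio; everything else is a single pass over the variables.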

This last tweak doesn't just accelerate convergence to \(\varepsilon-\)feasible solutions. More importantly for me, it pretty much guarantees that our \(\varepsilon-\)feasible solution matches the best known lower bound, even if that bound was provided by an outside oracle. Bundle methods and the Volume algorithm can also mix solutions to relaxed subproblems in order to generate \(\varepsilon-\)feasible solutions, but the result lacks the last guarantee: their fractional solutions are even more superoptimal than the best bound, and that can make bounding and variable fixing difficult.

The secret sauce: AdaHedge

Before last Christmas's CVRP set covering LP, I had always used the multiplicative weight update (MWU) algorithm as my black box online learning algorithm: it wasn't great, but I couldn't find anything better. The two main downsides for me were that I had to know a "width" parameter ahead of time, as well as the number of iterations I wanted to run.

The width is essentially the range of the payoffs; in our case, the potential level of violation or satisfaction of each constraint by any solution to the relaxed subproblems. The dependence isn't surprising: folklore in Lagrangian relaxation also says that's a big factor there. The problem is that the most extreme violations and satisfactions are initialisation parameters for the MWU algorithm, and the iteration count for a given \(\varepsilon\) is quadratic in the width (proportional to \(\mathrm{max\_violation} \cdot \mathrm{max\_satisfaction}\)).
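Spelled out with the \(\lg n\) dependence from earlier, the iteration bound I'm describing has the shape

\[\mathcal{O}\left(\frac{\mathrm{max\_violation} \cdot \mathrm{max\_satisfaction} \cdot \lg n}{\varepsilon\sp{2}}\right),\]

so overestimating either end of the payoff range directly inflates the number of iterations MWU is tuned for.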

What's even worse is that the MWU is explicitly tuned for a specific iteration count. If I estimate that, given my worst-case width estimate, one million iterations will be necessary to achieve \(\varepsilon-\)feasibility, MWU tuned for 1M iterations will need 1M iterations, even if the actual width is narrower.

de Rooij and others published AdaHedge in 2013, an algorithm that addresses both these issues by smoothly estimating its parameter over time, without using the doubling trick.4 AdaHedge's loss (convergence rate to an \(\varepsilon-\)solution) still depends on the relaxation's width. However, it depends on the maximum width actually observed during the solution process, and not on any explicit worst-case bound. It's also not explicitly tuned for a specific iteration count, and simply keeps improving at a rate that roughly matches MWU's. If the instance happens to be easy, we will find an \(\varepsilon-\)feasible solution more quickly. In the worst case, the iteration count is never much worse than that of an optimally tuned MWU.
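Here's a compressed sketch of the update, from my reading of the paper (experts correspond to constraints in our reduction); the function names, state layout, and the tiny floor on \(\Delta\) that dodges the \(\eta = \infty\) first round are mine:

    ;; AdaHedge state: per-expert cumulative losses and the cumulative
    ;; mixability gap DELTA; the learning rate is eta = ln(n) / DELTA.
    (defun adahedge-weights (cumulative-losses delta)
      "Probability vector AdaHedge plays given the state so far."
      (let* ((n (length cumulative-losses))
             (eta (/ (log (float n 1d0)) (max delta 1d-12)))
             (lowest (reduce #'min cumulative-losses))
             (raw (map 'vector (lambda (l) (exp (- (* eta (- l lowest)))))
                       cumulative-losses))
             (total (reduce #'+ raw)))
        (map 'vector (lambda (w) (/ w total)) raw)))

    (defun adahedge-update (cumulative-losses delta weights losses)
      "Fold one round of LOSSES into the state; returns the new DELTA."
      (let* ((eta (/ (log (float (length losses) 1d0)) (max delta 1d-12)))
             (hedge-loss (reduce #'+ (map 'vector #'* weights losses)))
             (lmin (reduce #'min losses))
             ;; mix loss -1/eta * ln(sum_i w_i exp(-eta l_i)), shifted by LMIN
             ;; to keep the exponentials from overflowing.
             (mix-loss (- lmin
                          (/ (log (reduce #'+ (map 'vector
                                                   (lambda (w l)
                                                     (* w (exp (- (* eta (- l lmin))))))
                                                   weights losses)))
                             eta))))
        (map-into cumulative-losses #'+ cumulative-losses losses)
        (+ delta (max 0d0 (- hedge-loss mix-loss)))))

The learning rate only ever shrinks as the observed losses accumulate, which is what makes the algorithm adapt to the width it actually sees rather than a worst-case estimate.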

These 400 lines of Common Lisp implement AdaHedge and use it to optimise the set covering LP. AdaHedge acts as the online black box solver for the surrogate dual problem, the relaxed set covering LP is a linear knapsack, and each subproblem attempts to improve the lower bound before maximising feasibility.

When I ran the code, I had no idea how long it would take to find a feasible enough solution: covering constraints can never be violated by more than \(1\), but some points could be covered by hundreds of tours, so the worst case satisfaction width is high. I had to rely on the way AdaHedge adapts to the actual hardness of the problem. In the end, \(34492\) iterations sufficed to find a solution that was \(4.5\%\) infeasible.5 This corresponds to a worst case with a width of less than \(2\), which is probably not what happened. It seems more likely that the surrogate dual isn't actually an omniscient adversary, and AdaHedge was able to exploit some of that "easiness."

The iterations themselves are also reasonable: one sparse matrix / dense vector multiplication to convert surrogate dual weights to an average constraint, one solve of the relaxed LP, and another sparse matrix / dense vector multiplication to compute violations for each constraint. The relaxed LP is a fractional \([0, 1]\) knapsack, so the bottleneck is sorting double floats. Each iteration took 1.8 seconds on my old laptop; I'm guessing that could easily be 10-20 times faster with vectorisation and parallelisation.
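As a concrete example of the first of those two products, here's a sketch that collapses a CSR-style constraint matrix against the expert weights; the layout and names are mine, and the right-hand sides get the same weighted-sum treatment:

    ;; Collapse the constraint matrix (CSR-style double-float vectors) against
    ;; the expert WEIGHTS: returns the coefficient vector of the single
    ;; surrogate constraint, sum_i w_i * A_i.
    (defun weights->surrogate-row (weights row-starts col-indices values n-columns)
      (let ((row (make-array n-columns :element-type 'double-float
                                       :initial-element 0d0)))
        (loop for i below (length weights)
              for w = (aref weights i)
              do (loop for k from (aref row-starts i) below (aref row-starts (1+ i))
                       do (incf (aref row (aref col-indices k))
                                (* w (aref values k)))))
        row))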

In another post, I'll show how using the same surrogate dual optimisation algorithm to mimic Lagrangian decomposition instead of Lagrangian relaxation guarantees an iteration count in \(\mathcal{O}\left(\frac{\lg \#\mathrm{nonzero}}{\varepsilon\sp{2}}\right)\), independently of luck or the specific linear constraints.


  1. Yes, I have been banging my head against that wall for a while.

  2. This is equivalent to minimising expected loss with random bits, but cleans up the reduction.

  3. When was the last time you had to worry whether that log was natural or base-2?

  4. The doubling trick essentially says to start with an estimate for some parameter (e.g., the width), then, whenever the parameter's actual value exceeds the estimate, grow the estimate enough to at least double the expected iteration count. The sum telescopes, and we only pay a constant multiplicative overhead for the dynamic update.

  5. I think I computed the \(\log\) of the number of decision variables instead of the number of constraints, so maybe this could have gone a bit better.

23 Apr 2019 10:05pm GMT

04 Nov 2018

feedPlanet Sun

5 Budget-Friendly Telescopes You Can Choose For Viewing Planets

Socrates couldn't have been more right when he said: "I know one thing, that I know nothing." Even with all of the advancements we, as a species, have made in this world, it's still nothing compared to the countless wonders waiting to be discovered in the vast universe. If you've recently developed an interest in ... Read more

04 Nov 2018 1:27pm GMT

20 May 2012

feedPlanet Sun

Annular Solar Eclipse on Sunday, May 20th 2012

On Sunday, May 20th 2012, people in a narrow strip from Japan to the western United States will be able to see an annular solar eclipse, the first in 18 years. The moon will cover as much as 94% of the sun. An Annular Solar Eclipse is different from a Total Solar Eclipse, when the ... Read more

20 May 2012 9:51pm GMT

10 Nov 2011

feedPlanetJava

OSDir.com - Java: Oracle Introduces New Java Specification Requests to Evolve Java Community Process

From the Yet Another dept.:

To further its commitment to the Java Community Process (JCP), Oracle has submitted the first of two Java Specification Requests (JSRs) to update and revitalize the JCP.

10 Nov 2011 6:01am GMT

OSDir.com - Java: No copied Java code or weapons of mass destruction found in Android

From the Fact Checking dept.:

ZDNET: Sometimes the sheer wrongness of what is posted on the web leaves us speechless. Especially when it's picked up and repeated as gospel by otherwise reputable sites like Engadget. "Google copied Oracle's Java code, pasted in a new license, and shipped it," they reported this morning.



Sorry, but that just isn't true.

10 Nov 2011 6:01am GMT

OSDir.com - Java: Java SE 7 Released

From the Grande dept.:

Oracle today announced the availability of Java Platform, Standard Edition 7 (Java SE 7), the first release of the Java platform under Oracle stewardship.

10 Nov 2011 6:01am GMT

08 Nov 2011

feedfosdem - Google Blog Search

papupapu39 (papupapu39)'s status on Tuesday, 08-Nov-11 00:28 ...

papupapu39 · http://identi.ca/url/56409795 #fosdem #freeknowledge #usamabinladen · about a day ago from web.

08 Nov 2011 12:28am GMT

05 Nov 2011

feedfosdem - Google Blog Search

Write and Submit your first Linux kernel Patch | HowLinux.Tk ...

FOSDEM (Free and Open Source Development European Meeting) is a European event centered around Free and Open Source software development. It is aimed at developers and all interested in the Free and Open Source news in the world. ...

05 Nov 2011 1:19am GMT

03 Nov 2011

feedfosdem - Google Blog Search

Silicon Valley Linux Users Group – Kernel Walkthrough | Digital Tux

FOSDEM (Free and Open Source Development European Meeting) is a European event centered around Free and Open Source software development. It is aimed at developers and all interested in the Free and Open Source news in the ...

03 Nov 2011 3:45pm GMT

28 Oct 2011

feedPlanet Ruby

O'Reilly Ruby: MacRuby: The Definitive Guide

Ruby and Cocoa on OS X, the iPhone, and the Device That Shall Not Be Named

28 Oct 2011 8:00pm GMT

14 Oct 2011

feedPlanet Ruby

Charles Oliver Nutter: Why Clojure Doesn't Need Invokedynamic (Unless You Want It to be More Awesome)

This was originally posted as a comment on @fogus's blog post "Why Clojure doesn't need invokedynamic, but it might be nice". I figured it's worth a top-level post here.

Ok, there's some good points here and a few misguided/misinformed positions. I'll try to cover everything.

First, I need to point out a key detail of invokedynamic that may have escaped notice: any case where you must bounce through a generic piece of code to do dispatch -- regardless of how fast that bounce may be -- prevents a whole slew of optimizations from happening. This might affect Java dispatch, if there's any argument-twiddling logic shared between call sites. It would definitely affect multimethods, which are using a hand-implemented PIC. Any case where there's intervening code between the call site and the target would benefit from invokedynamic, since invokedynamic could be used to plumb that logic and let it inline straight through. This is, indeed, the primary benefit of using invokedynamic: arbitrarily complex dispatch logic folds away allowing the dispatch to optimize as if it were direct.

Your point about inference in Java dispatch is a fair one...if Clojure is able to infer all cases, then there's no need to use invokedynamic at all. But unless Clojure is able to infer all cases, then you've got this little performance time bomb just waiting to happen. Tweak some code path and obscure the inference, and kablam, you're back on a slow reflective impl. Invokedynamic would provide a measure of consistency; the only unforeseen perf impact would be when the dispatch turns out to *actually* be polymorphic, in which case even a direct call wouldn't do much better.

For multimethods, the benefit should be clear: the MM selection logic would be mostly implemented using method handles and "leaf" logic, allowing hotspot to inline it everywhere it is used. That means for small-morphic MM call sites, all targets could potentially inline too. That's impossible without invokedynamic unless you generate every MM path immediately around the eventual call.

Now, on to defs and Var lookup. Depending on the cost of Var lookup, using a SwitchPoint-based invalidation plus invokedynamic could be a big win. In Java 7u2, SwitchPoint-based invalidation is essentially free until invalidated, and as you point out that's a rare case. There would essentially be *no* cost in indirecting through a var until that var changes...and then it would settle back into no cost until it changes again. Frequently-changing vars could gracefully degrade to a PIC.

It's also dangerous to understate the impact code size has on JVM optimization. The usual recommendation on the JVM is to move code into many small methods, possibly using call-through logic as in multimethods to reuse the same logic in many places. As I've mentioned, that defeats many optimizations, so the next approach is often to hand-inline logic everywhere it's used, to let the JVM have a more optimizable view of the system. But now we're stepping on our own feet...by adding more bytecode, we're almost certainly impacting the JVM's optimization and inlining budgets.

OpenJDK (and probably the other VMs too) has various limits on how far it will go to optimize code. A large number of these limits are based on the bytecoded size of the target methods. Methods that get too big won't inline, and sometimes won't compile. Methods that inline a lot of code might not get inlined into other methods. Methods that inline one path and eat up too much budget might push out more important calls later on. The only way around this is to reduce bytecode size, which is where invokedynamic comes in.

As of OpenJDK 7u2, MethodHandle logic is not included when calculating inlining budgets. In other words, if you push all the Java dispatch logic or multimethod dispatch logic or var lookup into mostly MethodHandles, you're getting that logic *for free*. That has had a tremendous impact on JRuby performance; I had previous versions of our compiler that did indeed infer static target methods from the interpreter, but they were often *slower* than call site caching solely because the code was considerably larger. With invokedynamic, a call is a call is a call, and the intervening plumbing is not counted against you.

Now, what about negative impacts to Clojure itself...

#0 is a red herring. JRuby supports Java 5, 6, and 7 with only a few hundred lines of changes in the compiler. Basically, the compiler has abstract interfaces for doing things like constant lookup, literal loading, and dispatch that we simply reimplement to use invokedynamic (extending the old non-indy logic for non-indified paths). In order to compile our uses of invokedynamic, we use Rémi Forax's JSR-292 backport, which includes a "mock" jar with all the invokedynamic APIs stubbed out. In our release, we just leave that library out, reflectively load the invokedynamic-based compiler impls, and we're off to the races.

#1 would be fair if the Oracle Java 7u2 early-access drops did not already include the optimizations that gave JRuby those awesome numbers. The biggest of those optimizations was making SwitchPoint free, but also important are the inlining discounting and MutableCallSite improvements. The perf you see for JRuby there can apply to any indirected behavior in Clojure, with the same perf benefits as of 7u2.

For #2, to address the apparent vagueness in my blog post...the big perf gain was largely from using SwitchPoint to invalidate constants rather than pinging a global serial number. Again, indirection folds away if you can shove it into MethodHandles. And it's pretty easy to do it.

#3 is just plain FUD. Oracle has committed to making invokedynamic work well for Java too. The current thinking is that "lambda", the support for closures in Java 7, will use invokedynamic under the covers to implement "function-like" constructs. Oracle has also committed to Nashorn, a fully invokedynamic-based JavaScript implementation, which has many of the same challenges as languages like Ruby or Python. I talked with Adam Messinger at Oracle, who explained to me that Oracle chose JavaScript in part because it's so far away from Java...as I put it (and he agreed) it's going to "keep Oracle honest" about optimizing for non-Java languages. Invokedynamic is driving the future of the JVM, and Oracle knows it all too well.

As for #4...well, all good things take a little effort :) I think the effort required is far lower than you suspect, though.

14 Oct 2011 2:40pm GMT

07 Oct 2011

feedPlanet Ruby

Ruby on Rails: Rails 3.1.1 has been released!

Hi everyone,

Rails 3.1.1 has been released. This release requires at least sass-rails 3.1.4

CHANGES

ActionMailer

ActionPack

ActiveModel

ActiveRecord

ActiveResource

ActiveSupport

Railties

SHA-1

You can find an exhaustive list of changes on GitHub, along with the closed issues marked for v3.1.1.

Thanks to everyone!

07 Oct 2011 5:26pm GMT

26 Jul 2008

feedFOSDEM - Free and Open Source Software Developers' European Meeting

Update your RSS link

If you see this message in your RSS reader, please correct your RSS link to the following URL: http://fosdem.org/rss.xml.

26 Jul 2008 5:55am GMT

25 Jul 2008

feedFOSDEM - Free and Open Source Software Developers' European Meeting

Archive of FOSDEM 2008

These pages have been archived.
For information about the latest FOSDEM edition please check this url: http://fosdem.org

25 Jul 2008 4:43pm GMT

09 Mar 2008

feedFOSDEM - Free and Open Source Software Developers' European Meeting

Slides and videos online

Two weeks after FOSDEM, we are proud to publish most of the slides and videos from this year's edition.

All of the material from the Lightning Talks has been put online. We are still missing some slides and videos from the Main Tracks but we are working hard on getting those completed too.

We would like to thank our mirrors: HEAnet (IE) and Unixheads (US) for hosting our videos, and NamurLUG for quick recording and encoding.

The videos from the Janson room were live-streamed during the event and are also online on the Linux Magazin site.

We are having some synchronisation issues with Belnet (BE) at the moment. We're working to sort these out.

09 Mar 2008 3:12pm GMT