24 Nov 2020

Planet Python

PyCoder’s Weekly: Issue #448 (Nov. 24, 2020)


Synthetic Data Vault (SDV): A Python Library for Dataset Modeling

Creating realistic data for testing applications can be difficult, especially when you have complex data requirements and privacy concerns make using real data problematic. Enter Synthetic Data Vault, a tool for modeling datasets that closely preserves important statistics, like mean and standard deviation.
ESMAEIL ALIZADEH

Python enumerate(): Simplify Looping With Counters

Once you learn about for loops in Python, you know that using an index to access items in a sequence isn't very Pythonic. So what do you do when you need that index value? In this tutorial, you'll learn all about Python's built-in enumerate(), where it's used, and how you can emulate its behavior.
REAL PYTHON
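
As a rough illustration of the idea in this blurb, here is a minimal sketch comparing the built-in enumerate() with a hand-rolled generator that emulates it. The name my_enumerate and the sample list are assumptions made up for this example; they are not taken from the tutorial.

```python
def my_enumerate(iterable, start=0):
    """Yield (count, item) pairs, mimicking the built-in enumerate()."""
    count = start
    for item in iterable:
        yield count, item
        count += 1


colors = ["red", "green", "blue"]  # sample data, invented for this sketch

# Built-in version: no manual index bookkeeping needed.
builtin_pairs = list(enumerate(colors, start=1))

# The hand-rolled generator should produce the same pairs.
emulated_pairs = list(my_enumerate(colors, start=1))

print(builtin_pairs)
print(emulated_pairs == builtin_pairs)
```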

Profile, Understand & Optimize Code Performance

You can't improve what you can't measure. Profile and understand code behavior and performance (Wall-time, I/O, CPU, HTTP requests, SQL queries). Install in minutes. Browse through appealing graphs. Supports all Python versions. Works in dev, test/staging & production →
BLACKFIRE sponsor

Reproducible and Upgradable Conda Environments: Dependency Management With conda-lock

If your application uses Conda to manage dependencies, you face a dilemma. On the one hand, you want to pin all your dependencies to specific versions, so you get reproducible builds. On the other hand, once you've pinned everything, upgrades become difficult. Enter conda-lock.
ITAMAR TURNER-TRAURING

Regular Expressions and Building Regexes in Python

In this course, you'll learn how to perform more complex string pattern matching using regular expressions, or regexes, in Python. You'll also explore more advanced regex tools and techniques that are available in Python.
REAL PYTHON course

PyInstaller 4.1 Supports Python 3.8 and 3.9

PYINSTALLER.READTHEDOCS.IO

Discussions

My Students Challenged Me to Write the Smallest Graphical User Interface That Includes Actual User Interaction

REDDIT

Advantages of Pattern Matching: A Simple Comparative Analysis

PYTHON.ORG

Python Jobs

Advanced Python Engineer (Newport Beach, CA, USA)

Research Affiliates

Python Developer / Software Engineer (Berlin, Germany)

Thermondo GmbH

Senior Full Stack Developer (Chicago, IL, USA)

Panopta

Senior Software Engineer, Platform (Remote)

Silicon Therapeutics

More Python Jobs >>>

Articles & Tutorials

10 Python Skills They Don't Teach in Bootcamp

Here are ten practical and little-known pandas tips to help you take your skills to the next level.
NICOLE JANEWAY BILLS

Using Python's bisect module

Python's bisect module has tools for searching and inserting values into sorted lists. It's one of Python's "batteries-included" features that often gets overlooked, but can be a great tool for optimizing certain kinds of code.
JOHN LEKBERG
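
A minimal sketch of what searching and inserting with bisect can look like. The list of scores is invented for illustration, and the list must already be sorted for these calls to behave as described.

```python
import bisect

scores = [10, 20, 20, 30, 50]  # sample data; must already be sorted

# Index of the first position where 20 could be inserted (before existing 20s).
left = bisect.bisect_left(scores, 20)

# Index just past the last existing 20.
right = bisect.bisect_right(scores, 20)

# Insert 40 while keeping the list sorted.
bisect.insort(scores, 40)

print(left, right, scores)
```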

Python Developers Are in Demand on Vettery

Get discovered by top companies using Vettery to actively grow their tech teams with Python developers (like you). Here's how it works: create a profile, name your salary, and connect with hiring managers at startups to Fortune 500 companies. Sign up today - it's completely free for job-seekers →
VETTERY sponsor

How to Use Serializers in the Django Python Web Framework

Serialization transforms data into a format that can be stored or transmitted and then reconstructs it for use. There are some quick-and-dirty ways to serialize data in pure Python, but you often need to perform more complex actions during the serialization process, like validating data. The Django REST Framework has some particularly robust and full-featured serializers.
RENATO OLIVEIRA

Sentiment Analysis, Fourier Transforms, and More Python Data Science

Are you interested in learning more about Natural Language Processing? Have you heard of sentiment analysis? This week on the show, Kyle Stratis returns to talk about his new article titled, Use Sentiment Analysis With Python to Classify Movie Reviews. David Amos is also here, and all of us cover another batch of PyCoder's Weekly articles and projects.
REAL PYTHON podcast

Formatting Python Strings

In this course, you'll add two items to your Python string formatting toolkit: the string format method and the formatted string literal, or f-string. You'll learn about both techniques in detail.
REAL PYTHON course

Spend Less Time Debugging, and More Time Building with Scout APM

Scout APM uses tracing logic that ties bottlenecks to source code to give you the performance insights you need in less than 4 minutes! Start your free 14-day trial today and Scout will donate $5 to the OSS of your choice when you deploy.
SCOUT APM sponsor

When You Import a Python Package and It Is Empty

Did you know Python has two different kinds of packages: regular packages and namespace packages? It turns out that trying to import a regular package when you don't have the right permissions causes Python to import it as a namespace package, and some unexpected things happen.
PETR ZEMEK
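
The sketch below does not reproduce the permissions scenario from the article. It only shows, under simple assumptions, how a namespace package differs from a regular one: a directory without an __init__.py still imports, but it exposes none of the names you might expect. The package names demo_regular_pkg and demo_namespace_pkg are hypothetical and exist only in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Regular package: a directory that contains an __init__.py file.
    regular_dir = Path(root) / "demo_regular_pkg"
    regular_dir.mkdir()
    (regular_dir / "__init__.py").write_text("VALUE = 1\n")

    # No __init__.py here, so Python treats the directory as a namespace package.
    (Path(root) / "demo_namespace_pkg").mkdir()

    sys.path.insert(0, root)
    try:
        regular = importlib.import_module("demo_regular_pkg")
        namespace = importlib.import_module("demo_namespace_pkg")
    finally:
        sys.path.remove(root)

# The regular package ran its __init__.py; the namespace package has nothing to run.
regular_has_value = hasattr(regular, "VALUE")
namespace_has_value = hasattr(namespace, "VALUE")
namespace_file = getattr(namespace, "__file__", None)

print(regular_has_value, namespace_has_value, namespace_file)
```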

Python Extensions with Rust and Go

Python extensions are a great way to leverage performance from another language while keeping a friendly Python API. How viable are Rust and Go for writing Python extensions? Are there reasons to use one over the other?
BRUCE ECKEL

Split Your Dataset With scikit-learn's train_test_split()

In this tutorial, you'll learn why it's important to split your dataset in supervised machine learning and how to do that with train_test_split() from scikit-learn.
REAL PYTHON

IPython for Web Devs

This free, open-source book will help you learn more about IPython, a rich toolkit that helps you make the most out of using Python interactively.
ERIC HAMITER

Add a New Dimension to Your Photos Using Python

Learn how to add some motion and a third dimension to a photo using depth estimation and inpainting.
DYLAN ROY

Projects & Code

klio: Smarter Data Pipelines for Audio

GITHUB.COM/SPOTIFY

nbdev: Create Delightful Python Projects Using Jupyter Notebooks

GITHUB.COM/FASTAI

pyo3: Rust Bindings for the Python Interpreter

GITHUB.COM/PYO3

yappi: Yet Another Python Profiler

GITHUB.COM/SUMERC

topalias: Linux Bash/ZSH Aliases Generator

GITHUB.COM/CSREDRAT

eff: Library for Working With Algebraic Effects

GITHUB.COM/ORSINIUM-LABS

SDV: Synthetic Data Generation for Tabular, Relational, Time Series Data

GITHUB.COM/SDV-DEV

Events

Real Python Office Hours (Virtual)

November 25, 2020
REALPYTHON.COM

Pyjamas 2020 (Virtual)

December 5, 2020
PYJAMAS.LIVE

BelPy 2021 (Virtual)

January 30 - 31, 2021
BELPY.IN • Shared by Gajendra Deshpande


Happy Pythoning!
This was PyCoder's Weekly Issue #448.

24 Nov 2020 7:30pm GMT

Stack Abuse: Rotate Axis Labels in Matplotlib

Introduction

Matplotlib is one of the most widely used data visualization libraries in Python. Much of Matplotlib's popularity comes from its customization options - you can tweak just about any element from its hierarchy of objects.

In this tutorial, we'll take a look at how to rotate axis text/labels in a Matplotlib plot.

Creating a Plot

Let's create a simple plot first:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)
plt.show()

[Figure: simple Matplotlib plot]

Rotate X-Axis Labels in Matplotlib

Now, let's take a look at how we can rotate the X-Axis labels here. There are two ways to go about it - change it on the Figure-level using plt.xticks(), or change it on an Axes-level by using tick.set_rotation() on each label, or even by using ax.set_xticklabels() or ax.tick_params().

Let's start off with the first option:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)
plt.xticks(rotation = 45) # Rotates X-Axis Ticks by 45-degrees
plt.show()

Here, we've set the rotation of xticks to 45, signifying a 45-degree tilt, counterclockwise:

[Figure: X-axis labels rotated with xticks]

Note: This function, like all others here, should be called after plt.plot(), lest the ticks end up being potentially cropped or misplaced.

Another option would be to get the current Axes object and call ax.set_xticklabels() on it. Here we can set the labels, as well as their rotation:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

ax = plt.gca()
plt.draw()

ax.set_xticklabels(ax.get_xticks(), rotation = 45)

plt.show()

Note: For this approach to work, you'll need to call plt.draw() before accessing or setting the X tick labels. The labels are only populated once the plot is drawn; before that, they return empty text values.

[Figure: X-axis labels rotated with set_xticklabels]

Alternatively, we could've iterated over the ticks in the ax.get_xticklabels() list. Then, we can call tick.set_rotation() on each of them:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

ax = plt.gca()
plt.draw()

for tick in ax.get_xticklabels():
    tick.set_rotation(45)
plt.show()

This also results in:

[Figure: X-axis labels rotated with set_rotation]

And finally, you can use the ax.tick_params() function and set the label rotation there:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

ax = plt.gca()
ax.tick_params(axis='x', labelrotation = 45)
plt.show()

This also results in:

[Figure: X-axis labels rotated with tick_params]

Rotate Y-Axis Labels in Matplotlib

The exact same steps can be applied for the Y-Axis labels.

Firstly, you can change it on the Figure-level with plt.yticks(), or on the Axes-level by using tick.set_rotation() or by using ax.set_yticklabels() or ax.tick_params().

Let's start off with the first option:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)
plt.yticks(rotation = 45)
plt.show()

Same as last time, this rotates the yticks by 45 degrees:

[Figure: Y-axis labels rotated with yticks]

Now, let's work directly with the Axes object:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

ax = plt.gca()
plt.draw()

ax.set_yticklabels(ax.get_yticks(), rotation = 45)

plt.show()

The same note applies here: you have to call plt.draw() before this call to make it work correctly.

[Figure: Y-axis labels rotated with set_yticklabels]

Now, let's iterate over the list of ticks and set_rotation() on each of them:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

ax = plt.gca()
plt.draw()

for tick in ax.get_yticklabels():
    tick.set_rotation(45)
plt.show()

This also results in:

[Figure: Y-axis labels rotated with set_rotation]

And finally, you can use the ax.tick_params() function and set the label rotation there:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

ax = plt.gca()
ax.tick_params(axis='y', labelrotation = 45)
plt.show()

This also results in:

[Figure: Y-axis labels rotated with tick_params]

Rotate Dates to Fit in Matplotlib

Most often, the reason people rotate ticks in their plots is because they contain dates. Dates can get long, and even with a small dataset, they'll start overlapping and will quickly become unreadable.

Of course, you can rotate them like we did before. Usually, a 45-degree tilt will solve most of the problems, while a 90-degree tilt will free up even more space.

Though, there's another option for rotating and fixing dates in Matplotlib, which is even easier than the previous methods - fig.autofmt_xdate().

This Figure method rotates and right-aligns the X-axis date labels. Matplotlib's Figure does not provide a matching autofmt_ydate(), so dates on the Y-axis need one of the rotation methods shown earlier.

Let's take a look at how we can use it on the Seattle Weather Dataset:

import pandas as pd
import matplotlib.pyplot as plt

weather_data = pd.read_csv("seattleWeather.csv")

fig = plt.figure()
plt.plot(weather_data['DATE'], weather_data['PRCP'])
fig.autofmt_xdate()
plt.show()

This results in:

[Figure: dates automatically formatted to fit in Matplotlib]
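
If the CSV file isn't at hand, the same call can be tried on made-up data. The sketch below is only an illustration under stated assumptions: the dates and values are invented, and the Agg backend is selected so that it runs without a display. The rotation and ha arguments are spelled out for clarity; they match the method's documented defaults.

import datetime as dt

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data: 30 consecutive days and an arbitrary value for each day.
dates = [dt.date(2020, 1, 1) + dt.timedelta(days=i) for i in range(30)]
values = [(i * 7) % 11 for i in range(30)]

fig, ax = plt.subplots()
ax.plot(dates, values)

# Rotate and right-align the X-axis date labels.
fig.autofmt_xdate(rotation=30, ha="right")

# Collect the rotation of each X tick label to confirm the call took effect.
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
print(rotations)
```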

Conclusion

In this tutorial, we've gone over several ways to rotate Axis text/labels in a Matplotlib plot, including a specific way to format and fit dates.

If you're interested in Data Visualization and don't know where to start, make sure to check out our book on Data Visualization in Python.

Data Visualization in Python, a book for beginner to intermediate Python developers, will guide you through simple data manipulation with Pandas, cover core plotting libraries like Matplotlib and Seaborn, and show you how to take advantage of declarative and experimental libraries like Altair.

Data Visualization in Python

Understand your data better with visualizations! With over 275 pages, you'll learn the ins and outs of visualizing data in Python with popular libraries like Matplotlib, Seaborn, Bokeh, and more.

24 Nov 2020 3:29pm GMT

Real Python: Formatting Python Strings

In this course, you'll add two items to your Python string formatting toolkit: the string format method and the formatted string literal, or f-string. You'll learn about both techniques in detail.

In this course, you'll learn about:

  1. The string .format() method
  2. The formatted string literal, or f-string
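
As a quick, hedged illustration of the two techniques listed above, the sketch below formats the same values both ways. The variable names and values are invented for this example and do not come from the course.

```python
name = "Ada"        # sample values, invented for this sketch
balance = 1234.5

# 1. The string .format() method: placeholders are filled in positional order.
via_format = "{} has {:,.2f} credits".format(name, balance)

# 2. The f-string: expressions are written directly inside the braces.
via_fstring = f"{name} has {balance:,.2f} credits"

print(via_format)
print(via_fstring)
```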

24 Nov 2020 2:00pm GMT

Planet Lisp

Michał Herda: Goodbye, Hexstream

I am saddened that I need to write this post, but I need to make a public confession.

After Jean-Philippe Paradis, a Common Lisp programmer better known online as Hexstream, asked me to review his "extensive contributions" to the Common Lisp ecosystem, he seems to have disliked my reply so much that he has declared me the single biggest threat to the Common Lisp community right now.

(A gist copy of the review is here for people who would rather avoid browsing the full issue.)

The review appeared after yet another discussion thread on GitHub - originally about implementations of Clojure-like arrow macros in Common Lisp - was derailed by Hexstream in the traditional way in which he has derailed many other GitHub discussions: asserting as a logical fact that his preferences take precedence over other people's preferences, aggressively calling out other people for questioning this state of affairs, and finally playing the victim card of being silenced, censored, and tortured by a so-called Common Lisp Mafia.

Unlike the past few times, this time I have decided not to give up posting. On the contrary, I have spent a considerable amount of my personal time (including one all-nighter) to actually respond to every single post of Hexstream, analyze it, take it apart into individual claims that he is making, and refute every single false point that I could find to the best of my ability using the full extent of my available tools.

After several increasingly angry posts exchanged with Hexstream, in which I once again tried to coerce him into changing his course and no longer being an aggressive offender towards members of the Common Lisp community, and after being explicitly invited to analyze Hexstream's contribution to the Common Lisp community in a tweet of his, I replied to his request with an analysis of the public data collected from GitHub, Quicklisp and Hexstream's public CV. Hexstream has announced multiple times that he is proud of this information and that there is nothing to hide there; quite the contrary. Hence, I felt welcome to use it and see for myself what kinds of prominent contributions of his I must have missed.

It seems that my analysis of that data was not well-received; Hexstream disappeared with a mere "see you in 2021" comment, stating that he has projects with higher priorities to work on at the moment, and simply replied on GitHub that "my posts contain countless factual, logical and other errors". Afterwards, his Twitter contained this.

I did have a fair amount of respect left for phoe before today, but after he said I am not a Common Lisp expert and that I am a fraud, based on malicious deliberately superficial

Welp.

            (with-irony "

          

It seems to me that I must have thought the unthinkable. (How could I have said that he is not a Common Lisp expert and a fraud? How was it even possible!?) Moreover, I then dared to say it aloud. Worst of all, I even backed it all with solid, concrete, data-based evidence that cannot be immediately refuted as a mere opinion and requires some serious figuring out of how to turn it around so that the Common Lisp Mafia is guilty for all the facts that I've noted.

All of a sudden, after posting this single post, I have become the main threat to the whole Common Lisp community, declared impossible to directly and indirectly fund in an ethical manner, and then proclaimed to require immediate medical attention of psychiatric nature.

Oh goodness. I assume that the analysis must have been way too short for his liking. I regret that I have not found the time to go into his GitHub issues in detail...

            ")

          

So, Hexstream. If you're reading this, I hope that my review serves as a proper wake-up call for you to actually see that your behavior is off and needs adjustment in order for other people to actually consider you acceptable in the Common Lisp community. If it does not, I have done everything to actually try and help you as a fellow Common Lisp hacker. I can, and will, do no more in this matter, and will instead do everything to protect the people I respect, like, and cooperate with from your destructive influence.

You are planning to launch some kind of Common Lisp Revival 2020 Fundraiser soon. I would like to tell you that I consider you to be the wrong person to launch one: not even for any of the aforementioned reasons, but for the reason that to you, Common Lisp seems to be a completely different language than it is to me. Based on the above review that you have asked me to do, it seems that you perceive Common Lisp as a strictly single-player language where you have to struggle against countless feats and enemies on Twitter, GitHub, and wherever else, in order to produce anything of even the smallest value after grand feats and massive effort to struggle against censorship.

On the contrary, I know many people who consider Common Lisp to be a multiplayer language where people support one another, are eager to help each other, share knowledge, indulge in fascinating projects that would be tough to indulge in with other languages and, best of all, are not hostile towards one another at the smallest hint of suspicion. Some of those people form the Common Lisp Foundation that, in my opinion, should take over any kind of Common Lisp revival fundraisers.

Obviously, all other reasons from my analysis why you are not entitled to represent the Common Lisp community as head of such a fundraiser still apply. And they are much more damning than the worldview issue above.

  • Your claimed commercial expertise in Common Lisp is void.
  • Your fifteen years of overall experience in Lisp have no basis in actual code.
  • Your projects larger than micro-utilities have been so poor that, as you claim, you have disposed of them yourself.
  • Your micro-utilities do not have a single dependent in the main Quicklisp distribution and they do not show signs of actual use by programmers.
  • Your documentation projects are generally not acceptable in the Common Lisp community because they are encumbered by the implicit unbearable personality of their author.
  • You have not contributed a single line of code to any GitHub repository hosted by anyone else throughout your eight and a half years of presence on GitHub and fifteen years of overall programming experience that you claim to have.
  • You derail GitHub conversations with offensive and aggressive comments, indulge in Twitter rants containing more offensive and aggressive comments, and tie them together with your personal website containing even more offensive and aggressive comments.
  • You repeatedly defame various honored and respected members of the Common Lisp community, including Rainer Joswig, Michael Fiano, Daniel Kochmański, Stas Boukarev, and Zach Beane. And, I guess, me.
  • Oh, about Zach! Have I mentioned https://xach.exposed ?

And to top it all, after the above analysis was posted, instead of fulfilling my hopes and responding to this critique of your Lisp merit by indulging in meritocratic discussion about your technical contributions to the Common Lisp ecosystem, you instead immediately announced that I require psychiatric help.

For completeness, I do have to admit: you have been popularizing crowdfunding among Lispers and achieved visible success there, with multiple authors and repositories adopting various means of crowdfunding (GitHub Sponsors, Patreon, LiberaPay) thanks to your efforts and suggestions. This is the one single thing that I can unambiguously consider a net positive coming from you. That's all.

Other than that, I do have to repeat what I have said at the end of my analysis. You try to pose as a Common Lisp expert. No, with all of the above I have no reasons to claim that you are one. Your expertise is hollow. Your experience seems false. You pretend to be someone you are not. You are a scam, Hexstream, and I am saddened and torn that I need to speak these words because I sincerely wish you were not.


The earliest Lisp commit that I was able to find in my GitHub repositories is from November 2015. That is exactly five years ago. In 2015, I was getting frustrated over Emacs keybindings. In 2015, you were "exposing" Zach Beane. Through these five years, I was learning Lisp to the best of my ability. Through these five years, you were doing I have no idea what. I can only guess based on what I see.

And I see Twitter rants. I see GitHub issue derailments. I see self-announced policies that contradict one another. I see tiny Lisp libraries with zero users. I see no other Lisp code of yours. I see no code of yours in any other GitHub repositories. I see big claims backed by nothing. I see an image of a Common Lisp expert that is so fragile that it falls into pieces after a brief glance.

Seriously, what were you doing with your life during these years? Researching ethics? Verifying the boundaries set by Twitter and GitHub moderation teams? Fighting for your life while the Common Lisp Mafia caged you and demanded a ransom of 20,000,000 US parentheses for your freedom?

I simply cannot comprehend it. And I do feel sorry for you, since most likely neither can you.


If you are still reading, please answer one question that I will ask at the end of this block of text. I will attempt to be somewhat honest regarding myself in the topic of my own impact on the Common Lisp community, as I see it. No boasting too much, not being too humble. Let's try it.

I have attempted to complete the Common Lisp UltraSpec which I talked about at a European Lisp Symposium one time and then failed miserably at this task after grossly misestimating it. I have implemented package-local nicknames in Clozure Common Lisp and then used the momentum from that work to make a portability library for package-local nicknames. I have managed to rewrite and optimize the somewhat famed split-sequence system commonly used in the Common Lisp ecosystem. I have managed to overhaul the even more famed Lisp Koans by rewriting them almost from scratch and fixing multiple compliance errors. I have successfully convinced Massachusetts Institute of Technology to release the Common Lisp WordNet interface under a permissive license (which took only half a year of pinging people via mail) and fixed it up as appropriate. I have written a utility suite for managing protocols and test cases with some documentation that I am proud of even after two whole years. I wrote an implementation of Petri nets in Common Lisp that seems either to work fine or not to be used at all, because I do not get much attention from it; still, I've tested it (hopefully) well enough to be useful in the general case. I recently wrote the fastest priority queue available in Common Lisp after someone mentioned that the ones on Quicklisp are too slow. I then ended up miserably failing at rewriting the Common Lisp arrows system, which resulted in a different system with a tutorial for arrows that I have received several thanks for. And then there are some smaller libraries that might not be all that worth mentioning.

I have been hosting the Online Lisp Meeting series, which has met with general acclaim and popularity and is considered a worthy continuous extension of the ideas of the European Lisp Symposium - even if, in my opinion, it contains a bit too much CL content, compared to the ELS ideals and statistics. The eleventh installment is bound to happen this week, where I will speak for the second time - again about the topic of control flow and condition systems. I already have two more talks queued up and we plan on going until the next European Lisp Symposium, which will most likely eat up all of the available talks and then some. (Maybe some of the rejected papers will sublimate as OLM videos though?... I sincerely hope so! ELS recently had to reject papers not because they were bad, but because they had an already full schedule.)

With the help of countless people at various stages of the book lifecycle and with support from Apress Publishing, I have managed to release the book The Common Lisp Condition System along with a pair of accompanying Common Lisp systems, the larger Portable Condition System and the smaller trivial-custom-debugger, plus a release of source code from the book and a free online appendix to the book containing content that did not make it inside in time. I have also proven that the condition system can be easily implemented in a non-Lisp language, namely Java, and I will talk about this at length with the WebAssembly committee to ensure that WASM has all the functionality necessary for Common Lisp to be implemented efficiently on it.

Finally, I made some art once. I think it did not sting anybody's eyes too hard. Not that it's all that strictly Lisp-related... but hey, it's CL implementations, and the Lisp Lizard.

I think I am generally tolerated and maybe even enjoyed in my community as a Common Lisp programmer, despite my occasional outbursts of frustration and outright stupidity. I try to be available on Reddit, IRC, Discord, and in private messages for all sorts of support that I am capable of providing. I try to teach other people the way I was taught when I was starting out. Whenever I notice that I should apologize and make amends because I fucked up somewhere (e.g. in the recent Quickdocs issue), I do try my best to be sorry and amend my behavior as appropriate, and I try to welcome other people's remarks and integrate them into my behavior as appropriate; I think it helps other people tolerate my behavior when I'm not easily tolerable.

And, well, you know, there is this single person in my environment who just keeps on smearing shit on people in my vicinity, but I don't think I care anymore; this person has willingly made so many enemies by now, that they are ignored by many, confronted by few (who actually have some time to spare), and hell, I even got some most unexpected people to cheer me on in my attempts to actually try and confront this guy and his bullshit excuses for repeatedly setting fires in the Common Lisp world.

But, yeah, anyway. You still consider that it's me who needs psychiatric help. Is that right?


So, Hexstream, this is a goodbye. Thank you for the unique chance to train my patience, persistence, and insistence. I assure you that it has not gone to waste, and I assure you that I will remember it for the rest of my life.

Since you do not seem to want to change your behavior in the slightest, then I wish you to stay on your current course and not change in the slightest so you may see for yourself where it leads you. The faster you slide into irrelevance because of your current choice, the healthier the Common Lisp community will be.

(And I mean the real Common Lisp community, containing more than just a single person who's purely accidentally named Jean-Philippe.)

Bye. I don't think I will miss you much, even though I adore the technical thought behind some of your libraries. And if I encounter you again on the Internet, be prepared to once again meet the side of me that has long run out of spare chances to give you.

24 Nov 2020 1:24pm GMT

23 Nov 2020

Planet Debian

Shirish Agarwal: White Hat Senior and Education

I had been thinking of doing a blog post on RCEP, which China signed with 14 countries a week and a day back, but this new story has broken and is going a bit viral on the interwebs, especially Twitter, and is pretty much in our domain, so I thought it would be better to do a blog post about it. Also, there is quite a lot packed in, so there is quite a bit of unpacking to do.

Whitehat, Greyhat and Blackhat

For those of you who may not know, there are three terms, especially in computer science, that one comes across: white hats, grey hats and black hats. Now, clinically, white hats are like the fiery angels or the good guys who basically take permission to try and find weaknesses in an application, program, website, organization and so on and so forth. Somewhat dated references to hackers could be Sandra Bullock (The Net, 1995), Sneakers (1992) and Live Free or Die Hard (2007). Of the three, one could argue that Sandra was actually into viruses, which are part of computer security, but still she showed some bad-ass skills, but then that is what actors are paid to do 🙂 Sneakers was much more interesting for me because in that you got the best key, which can unlock any lock, something like quantum computing is supposed to do. One could place both of the first two movies in either the white hat or the grey hat category. A grey hat is more flexible in his/her moral values, and there are plenty of such people. For example, Julian Assange could be described as a grey hat, but as you can see and understand, those are moral issues.

A black hat, on the other hand, is one who does things for profit even if it harms others. The easiest fictitious examples are in the Die Hard series: all of the films except the 4th one had bad guys or black hats. The 4th one is the odd one out, as it had Matthew Farrell (Justin Long) as a grey hat hacker. In real life, Kevin Mitnick, Kevin Poulsen, Robert Tappan Morris, George Hotz and Gary McKinnon are some examples of hackers, most of whom were black hats and most of whom reformed into white hats and security specialists. There are many other groups and names, but that perhaps is best left for another day altogether.

Now, why am I sharing this? Because in all of the above, the people who are using and working with the systems have a better than average understanding of systems, and they arguably would be better than most people at securing their networks, systems etc., but as we shall see, in this case there have been lots of issues in the company.

WhiteHat Jr. and 300 Million Dollars

Before I start, I would like to share that, to me, this suit in many ways seems similar to the suit filed against Krishnaraj Rao. The difference is that Krishnaraj Rao's suit was in real estate while this one is in 'education'; many things are similar between the cases, but they also differ in some obvious ways. For example, in the suit against Krishnaraj Rao, the plaintiffs first approached the High Court and then the Supreme Court. Of course, Krishnaraj Rao won in the High Court, and then in the SC the plaintiffs agreed to Krishnaraj Rao's demands as they knew they could not win in the SC. In that case, a compromise was reached by the plaintiffs just before judgement was to be delivered.

In this case, the plaintiff have directly come to the SC, short-circuiting the high court process. This seems to be a new trend with the current Government in power where the rich get to be in SC without having to go the Honorable HC . It says much about SC as well, as they entertained the application and didn't ask the plaintiff to go to the lower court first as should have been the case but that is and was honorable SC's right to decide . The charges against Pradeep Poonia (the defendant in this case) are very much similar to those which were made in Krishanraj Rao's suit hence won't be going into those details. They have claimed defamation and filed a 20 crore suit. The idea is basically to silence any whistle-blowers.

Fictional Character Wolf Gupta

The first issue in this case, or perhaps its most famous or infamous character, is an unknown. He has reportedly been hired by Google India, BJYU, Chandigarh. This has been reported by Yahoo News. I did a cursory search on LinkedIn to see if there indeed is a Wolf Gupta but wasn't able to find any person with such a name. I am not even talking about the amount of money/salary the fictitious gentleman is supposed to have got, or the variations in the salary figures at different times and in different ads.

If I wanted to, I could have asked a few of the kind souls whom I know are working at Google to see if they could find such a person using their own credentials, but it probably would have been a waste of time. When you show a LinkedIn profile on your social media, it should come up in the results; in this case it doesn't. I also tried to find out if somehow BJYU was a partner of Google and came up empty there as well. There is another story done by Kan India, but as I'm not a subscriber, I don't know what they have written; the beginning of the story itself does not bode well.

While I can understand marketing, there is a line between marketing something and being misleading. At least to me, all of the references shared seem misleading.

Taking down dissent

One of the big no-nos, at least from what I perceive, is that you cannot and should not take down dissent or critique. Indians, like most people elsewhere around the world, critique and criticize day and night. Social media like Twitter, Mastodon and many others would not exist in the first place if criticism were not there. In fact, one could argue that Twitter and most social media are used to drive engagement to a person, brand etc. It is even an official policy at Twitter. Now you can't drive engagement without also being open to critique, and this is true of all the web, including WordPress and me 🙂. What has been happening is that whitehatjr, with the help of bjyu, has been taking down people's content citing copyright violation, which seems laughable.

When citizens critique anything, we are obviously going to use the name of the product; otherwise people would have to start using new names, similar to how Tom Riddle was known as 'Dark Lord', 'Voldemort' and 'He who shall not be named'. There have been quite a few takedowns; I just provide one for reference, and the rest would probably come up in the ongoing suit/case.

Whitehat Jr. ad showing investors fighting


Now a brief synopsis of what the ad is about. The ad is about a kid named 'Chintu' who makes an app. The app is so good that investors come to his house and start fighting each other right on the lawn. The parents are enjoying looking at the fight, and to add to the whole thing there is also a nosy neighbor who has his own observations. Simply speaking, it is a juvenile ad, but it works as most parents in India, as elsewhere, are insecure.

Jihan critiquing the whitehatjr ad

Before starting, let me assure you that I asked Jihan's parents if it's ok to share his critique on my blog and they agreed. What he has done is break down the ad and show how juvenile it is, using logic and humor. He does make sure to state that he does not know how the product is as he hasn't used it. His critique is about the ad and not the product.

The Website

If you look at the website, sadly, most of the site only talks about itself rather than giving examples that people can look at in detail. For example, they say they have a few apps on the Google Play Store but give no link to confirm the same. The same is true of quite a few other things. In another ad, a Paralympic star says don't get into sports, get into coding. Which athlete in their right mind would say that? And it isn't that we (India) are brimming with athletes at the international level. In the last outing, in 2016, India sent a stunning 117 athletes, but that was an exception as we had the women's hockey squad of 16 women, and even then they were outnumbered by the bureaucratic and support staff. There was criticism about the staff bit, but that is probably a story for another day.

Most of the site doesn't really give much value, and the point seems to be to drive sales of their courses. This puts pressure on small kids as well as on teenagers and older students in the second and third year of science or engineering, whose parents don't get that it is advertising and that it is fake, and think that their kids are incompetent. The teenagers, more often than not, are unable to tell them that this is advertising and fake. Also, most of us have been on a good diet of ads. Fair and Lovely still sells even though we know it doesn't work.

This does remind me of a similar fake academy which used very similar tactics and which nobody remembers today. There used to be an academy called Wings Academy or some similar name. They used to advertise that if you came to them, they would make you into a pilot or an air hostess, and it was only much later that it was found out that most kids ended up doing laundry work in hotels and other such work. Many had taken loans, went bankrupt and even committed suicide because they were unable to pay off the loans, caught between the dreams sold by the company and the harsh realities that awaited them. They were sued in court, but I don't know what happened; soon they were off the radar, so we never came to know what happened to those millions of kids whose life dreams were shattered.

Security

Now comes the security part. They have alleged that Pradeep Poonia broke into their systems. While this may be true, what I find funny is how, with the name whitehat, they can justify it. If you say you are a white hat, you are supposed to be much better than this. And while I have not tried to penetrate their systems, I did find it laughable that the site is using an expired https:// certificate. How they could not have an automated script to renew it is beyond me. I could have tried further to figure out their systems but I chose not to. That is their concern, not mine.
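To make the point about automation concrete, here is a minimal sketch in Python of the kind of check a site owner could run from a scheduled job to be warned before a certificate lapses. It uses only the standard library. The function names and the idea of a daily check are my own assumptions for illustration, not something the company or anyone in this case is known to use. Note that with the default verification settings, connecting to a site whose certificate has already expired fails during the handshake, so this is a warning system for before the expiry date, not a probe of already broken sites.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter date (negative once expired).

    not_after uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def fetch_not_after(hostname, port=443, timeout=10):
    """Return the notAfter field of the certificate served by hostname.

    Hypothetical helper: it needs network access and raises an ssl error
    if the certificate does not verify (for example, if it has expired).
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call days_until_expiry(fetch_not_after(host)) for its own host and send a mail when the result drops below, say, 14 days.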

Comparison

A similar offering would be unacademy, but as can be seen, they neither try to push you in any way nor make any ridiculous claims. In fact, how genuine unacademy is can be gauged from the fact that many of its learning resources are available for people to see on YT, and if they have the tools they can also download them. Now, does this mean that every educational website should offer its content for free? Of course not. But when a channel has 80% - 90% of its YT content as ads and testimonials, that should surely give both parents and students reason to pause. But if parents had done that much research, things would not be where they are now.

Allegations

Just to complete, there are allegations by Pradeep Poonia, with some screenshots, which show the company has been doing a lot of bad things. For example, they were harassing an employee at 2 a.m.; the employee was frustrated and was working in the company at the time. Many of the company staff routinely made sexist, offensive and sexually abusive remarks privately among themselves about prospective women employees who came to interview via webcam (due to the pandemic). There also seems to be a bit of porn on the web/mobile server of the company. There have also been allegations that while the company says refunds are made the next day, many parents who have demanded refunds have not got them. Now while Pradeep has shared some of the quotations of the staff while hiding the identities of both the victims and the perpetrators, the language being used in itself tells a lot. I am in two minds whether to share those photos or not, hence atm I am choosing not to. Poonia has also contended that not all teachers know programming and that they are given scripts to share. There have been some people who did share that experience with him -

Suruchi Sethi

From the company's side, they are alleging he has hacked the company servers, and they would probably be using the 'fruit of the poisonous tree' argument, which we have seen used in many cases.

Conclusion

Now that lies in the eyes of the Court: whether the single bench chooses the literal meaning of the law, or goes by its spirit and the genuine concerns of the people involved. In today's hearing, the company asked for a complete, sweeping injunction but was unable to get it. Whatever may happen, we may hope to see some fireworks in the second hearing, which is slated for 6.01.2021, where all of this plays out. Till later.

23 Nov 2020 7:22pm GMT

Vincent Fourmond: QSoas tips and tricks: using meta-data, first level

In essence, QSoas works with \(y = f(x)\) datasets. However, in practice, when working with experimental data (or data generated from simulations), one often has more than one experimental parameter (\(x\)). For instance, one could record series of spectra (\(A = f(\lambda)\)) for different pH values, so that the absorbance is in fact a function of both the pH and \(\lambda\). QSoas has different ways to deal with such situations, and we'll describe one today, using meta-data.

Setting meta-data

Meta-data are simply series of name/value pairs attached to a dataset. The values can be numbers, dates or just text. Some of these are automatically detected from certain types of data files (but that is the topic for another day). The simplest way to set meta-data is to use the set-meta command:

QSoas> set-meta pH 7.5

This command sets the meta-data pH to the value 7.5. Keep in mind that QSoas does not know anything about the meaning of the meta-data[1]. It can keep track of the meta-data you give, and manipulate them, but it will not interpret them for you. You can set several meta-data by repeating calls to set-meta, and you can display the meta-data attached to a dataset using the command show. Here is an example:

QSoas> generate-buffer 0 10
QSoas> set-meta pH 7.5
QSoas> set-meta sample "My sample"
QSoas> show 0
Dataset generated.dat: 2 cols, 1000 rows, 1 segments, #0
Flags: 
Meta-data:      pH =     7.5    sample =         My sample

Note here the use of quotes around My sample since there is a space inside the value.

Using meta-data

There are many ways to use meta-data in QSoas. In this post, we will discuss just one: using meta-data in the output file. The output file can collect data from several commands, like peak data, statistics and so on. For instance, each time the command 1 is run, a line with the information about the largest peak of the current dataset is written to the output file. It is possible to automatically add meta-data to those lines by using the /meta= option of the output command. Just listing the names of the meta-data will add them to each line of the output file. As a full example, we'll see how one can take advantage of meta-data to determine how the position of the peak of the function \(x^2 \exp (-a\,x)\) depends on \(a\). For that, we first create a script that generates the function for a certain value of \(a\), sets the meta-data a to the corresponding value, and finds the peak. Let's call this file do-one.cmds (all the script files can be found in the GitHub repository):

generate-buffer 0 20 x**2*exp(-x*${1})
set-meta a ${1}
1 

This script takes a single argument, the value of \(a\), generates the appropriate dataset, sets the meta-data a and writes the data about the largest (and only in this case) peak to the output file. Let's now run this script with 1 as an argument:

QSoas> @ do-one.cmds 1

This command generates a file out.dat containing the following data:

## buffer       what    x       y       index   width   left_width      right_width     area
generated.dat   max     2.002002002     0.541340590883  100     3.4034034034    1.24124124124   2.16216216216   1.99999908761

This gives various information about the peak found: the name of the dataset it was found in, whether it's a maximum or minimum, the x and y positions of the peak, the index in the file, the widths of the peak and its area. We are interested here mainly in the x position. Then, we just run this script for several values of \(a\) using run-for-each, and in particular the option /range-type=lin that makes it interpret values like 0.5..5:80 as 80 values evenly spread between 0.5 and 5. The script is called run-all.cmds:

output peaks.dat /overwrite=true /meta=a
run-for-each do-one.cmds /range-type=lin 0.5..5:80
V all /style=red-to-blue

The first line sets up the output to the output file peaks.dat. The option /meta=a makes sure the meta a is added to each line of the output file, and /overwrite=true makes sure the file is overwritten just before the first data is written to it, in order to avoid accumulating the results of different runs of the script. The last line just displays all the curves with a color gradient. It looks like this:

Running this script (with @ run-all.cmds) creates a new file peaks.dat, whose first line looks like this:

## buffer       what    x       y       index   width   left_width      right_width     area    a

The column x (the 3rd) contains the position of the peaks, and the column a (the 10th) contains the meta a (this column wasn't present in the output we described above, because we had not yet used the output /meta=a command). Therefore, to load the peak position as a function of a, one just has to run:

QSoas> load peaks.dat /columns=10,3

This looks like this:

Et voilà ! To train further, you can:

[1] this is not exactly true. For instance, some commands like unwrap interpret the sr meta-data as a voltammetric scan rate if it is present. But this is the exception.
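As a cross-check of the peak positions collected above, note that the derivative of \(x^2 \exp(-a\,x)\) is \((2x - a\,x^2)\exp(-a\,x)\), which vanishes for \(x > 0\) at \(x = 2/a\). The following Python sketch (not QSoas code) redoes the search on a grid and compares it with that formula. It assumes that generate-buffer 0 20 produces 1000 evenly spaced points including both ends (the show output above reports 1000 rows) and that 0.5..5:80 includes both end values; the function names are hypothetical.

```python
import math


def f(x, a):
    """The function studied in the post: x^2 * exp(-a * x)."""
    return x ** 2 * math.exp(-a * x)


def grid_peak(a, x_min=0.0, x_max=20.0, n=1000):
    """Return (index, x, y) of the largest sample of f on an even grid.

    Assumption: the grid mimics `generate-buffer 0 20` with 1000 points.
    """
    step = (x_max - x_min) / (n - 1)
    xs = [x_min + i * step for i in range(n)]
    index = max(range(n), key=lambda i: f(xs[i], a))
    return index, xs[index], f(xs[index], a)


def linear_range(start, stop, count):
    """Evenly spread values, both ends included (assumed meaning of 0.5..5:80)."""
    step = (stop - start) / (count - 1)
    return [start + k * step for k in range(count)]


if __name__ == "__main__":
    # Columns: a, peak position found on the grid, analytical value 2/a
    for a in linear_range(0.5, 5.0, 80):
        _, x_peak, _ = grid_peak(a)
        print(f"{a:.4f}\t{x_peak:.6f}\t{2 / a:.6f}")
```

For \(a = 1\), the grid search lands on the same index and x value as the out.dat line shown earlier; the small offset from exactly 2 comes from the grid spacing.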

About QSoas

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050-5052. Current version is 2.2. You can download its source code there (or clone from the GitHub repository) and compile it yourself, or buy precompiled versions for MacOS and Windows there.

23 Nov 2020 6:55pm GMT

feedPlanet Lisp

Vsevolod Dyomkin: The Common Lisp Condition System Book

Several months ago I had the pleasure of being one of the reviewers of the book The Common Lisp Condition System (Beyond Exception Handling with Control Flow Mechanisms) by Michał Herda. I doubt that I have contributed much to the book, but, at least, I can express my appreciation in the form of a reader review here.

My overall impression is that the book is very well-written and definitely worth reading. I always considered special variables, the condition system, and multiple return values to be the most underappreciated features of Common Lisp, although I have never imagined that a whole book may be written on these topics (and even just two of them). So, I was pleasantly flabbergasted.

The book has a lot of things I value in good technical writing: a structured and logical exposition, detailed discussions of various nuances, a subtle sense of humor, and lots of Lisp. I should say that reading the stories of Tom, Kate, and Mark was so entertaining that I wished to learn more about their lives. I even daydreamt (to use the term often seen throughout the book) about a new semi-fiction genre: stories about people who behave like computer programs. I guess a book of short stories containing the two from this book and the story of Mac from "Practical Common Lisp" can already be initialized. "Anthropomorphic Lisp Tales"...

So, I can definitely recommend reading CLCS to anyone interested in expanding their Lisp knowledge and general understanding of programming concepts. And although I can call myself quite well versed in the CL condition system, I was also able to learn several new tricks and enrich my understanding. Actually, that is quite valuable as you never know when one of its features could come in handy to save your programming day. In my own Lisp career, I had several such a-ha moments and continue appreciating them.

This book should also be relevant to those who have a general understanding of Lisp but are compelled to spend their careers programming in inferior languages: you can learn more about one of the foundations of interactive programming and appreciate its value. Perhaps, one day you'll have access to programming environments that focus on this dimension or you'll be able to add elements of interactivity to your own workflow.

As for those who are not familiar with Lisp, I'd first start with the classic Practical Common Lisp.

So, thanks to Michał for another great addition to my virtual Lisp books collection. The spice must flow, as they say...

23 Nov 2020 12:41pm GMT

feedPlanet Grep

Lionel Dricot: For correspondence software rather than a mail client

A plea for software dedicated to electronic epistolary relationships, exchanges that have been sacrificed to the cult of instantaneity.

I love email. I never tire of marvelling at the beauty of this system, which lets us exchange in writing, in a decentralized way, and maintain dematerialized epistolary relationships away from prying eyes (if one chooses one's provider well). I have said it before and I say it again.

https://ploum.net/email-mon-amour/.

Yet indispensable email is regularly looked down upon. Nobody likes email. It is technical, laborious. It is cluttered with messages. So every new platform attracts us, giving us the impression that we can communicate more simply than with email.

Far too many users are drowning in their email. They postpone a reply until it is itself drowned in an incessant stream of solicitations, which, as a perverse effect, makes the sender insist.

Disillusioned, we are greatly tempted to turn to that enticing new platform. Everything seems simpler. There are fewer messages, and they are clearer. The reason is quite simple: the platform is new, and exchanges between its users are few. As soon as that platform becomes really popular, your message box will be flooded just like your email inbox. At most, some platforms strive to turn your boxes into streams so as to relieve you of the guilt, but at the cost of an even greater loss of information.

https://ploum.net/comment-jai-fui-le-flux-pour-retrouver-ma-boite/

That is why email is magnificent. After decades, it is still just as useful, just as indispensable. We can imagine a future without Google, a future without Facebook. But a future without email?

Email could be wonderful. But no mail client makes you want to write emails.

I dream of a mail client that would be a true writing tool. No options or frills. No HTML code. Writing an email the way one writes a letter, putting the recipient's address last, as one does for an envelope.

Email-writing software that would help us reconnect with our correspondence rather than merely enabling the completion of a mechanical task. Software that would encourage us to unsubscribe from everything unsolicited, that would flag the correspondence awaiting a reply. That would encourage us to archive a mail or to mark it as requiring action rather than leaving it to rot in our mailbox.

In short, I dream of a mail client that gives me back the pleasure of interacting with people, not with discussion threads or onomatopoeia.

On the other hand, I abhor the attempts at automatic sorting that are flourishing, for example on Gmail. Besides increasing the power of these algorithms, they merely hide the problem without trying to remedy it. If mails have to be sorted as "promotions" or "notifications", it is most of the time because I did not need to see them in the first place. Those emails should never have been sent.

Finally, true correspondence software should abandon this notion of notifications and real time. Once a day, like the postman's round, mail would be collected, clearly showing me my interactions for the day.

Likewise, the mails I have written would not be sent before a fixed hour in the evening, allowing me to modify and correct them. Better still, I should be forced to review what I am sending, as if I were going to the post office.

Pushing things a little further, sent mails could take a random amount of time to be delivered. A reader of my blog even imagined that this delay could be proportional to the distance, as if the mail were delivered on foot, on horseback or by boat.

For immediacy condemns us to solitude. If a mail is sent and a reply received instantly, the ubiquity of the smartphone almost obliges us to answer immediately, even in the middle of a shop or an activity, for fear of forgetting and of appearing rude.

The reply to the reply will be immediate too, and the conversation will end, the protagonists realizing that this real-time ping-pong cannot last more than a few words.

Paradoxically, by creating email, we destroyed a major feature of epistolary relationships: the possibility for each party to reply when they feel like it and when they are available.

Until the 20th century, nobody was surprised at not receiving a reply to a letter for several days or even weeks. Writing a follow-up letter was therefore an investment in itself: you had to remember, keep the desire and take the time to do it.

This delay allowed an explosion of creativity and knowledge. Large swathes of history are accessible to us thanks to the correspondence of the time. Many ideas germinated in exchanges of letters. Can you imagine the 21st century as seen by future historians through our emails?

A letter was read and reread. It planted a seed in the recipient, who reflected before taking up the pen, sometimes after several drafts, to write a reply.

A reply that was not written paragraph by paragraph, but was a letter in its own right. A reply written on the assumption that the reader no longer necessarily remembered the details of the initial letter. Today, email mainly serves to "organize a call" to discuss a subject nobody has taken the time to think about.

Historic chess games were played over several years by exchange of letters. Could one imagine the same thing with email? Hardly. Chess is now mostly played online in real time.

Yet the protocol allows it. It is simply a choice by software designers to have emphasized speed, immediacy, efficiency and quantity.

It would not take much to put back at the centre of written exchanges the quality we so sorely need.

We use mail to shed responsibility. There is an action to be taken, but by replying to the email, I pass the hot potato to someone else. Reply as fast as possible, if possible with a question, to defer the moment when someone will have to make a decision. All of this amid an unbelievable noise of automated advertising. We no longer exchange with humans; we are drowned in the noise of robots while trying to exchange with anonymous administrative agents. We no longer have time to read or to write, yet we believe we have the discernment to make quick decisions. We confuse, with dramatic consequences, efficiency and speed.

For human interaction, we then fell back on chats. Their format reminded us of a conversation; their design prevents us from handling them any other way than by answering immediately.

In doing so, we implicitly reduced human interaction to a short, brief, immediate exchange. An emotional brevity and speed that pushes us to garnish every piece of information with an ersatz emotion: the emoji.

We forget the possibility of having slow, deep, thoughtful exchanges.

Sometimes I dream of abandoning mail clients and messaging apps for a true correspondence client. Of leaving the immediacy of chat and the administrative coldness of mail to rediscover the pleasure of epistolary relationships.

Photo by Liam Truong on Unsplash

I am @ploum, engineer and writer. Subscribe to receive my posts by mail or RSS, share my writing with those around you and support me by buying my books. Order Printeurs, my latest science-fiction novel.

This text is published under the CC-By BE license.

23 Nov 2020 11:11am GMT

Mattias Geniar: wabt-sys compile: error: CMAKE_PROJECT_VERSION was not declared in this scope on Ubuntu 18.04 LTS

When trying to compile WebAssembly (wabt-rs) for a dependent package, it failed on Ubuntu 18.04 LTS due to this error message.

23 Nov 2020 12:00am GMT

Mattias Geniar: cargo: no such subcommand: +nightly

I was trying to run the following command: $ cargo +nightly install [package] error: no such subcommand: `+nightly` But … that didn't work.

23 Nov 2020 12:00am GMT

22 Nov 2020

feedPlanet Debian

François Marier: Removing a corrupted data pack in a Restic backup

I recently ran into a corrupted data pack in a Restic backup on my GnuBee. It led to consistent failures during the prune operation:

incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
hash does not match id: want 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5, got 2818331716e8a5dd64a610d1a4f85c970fd8ae92f891d64625beaaa6072e1b84
github.com/restic/restic/internal/repository.Repack
        github.com/restic/restic/internal/repository/repack.go:37
main.pruneRepository
        github.com/restic/restic/cmd/restic/cmd_prune.go:242
main.runPrune
        github.com/restic/restic/cmd/restic/cmd_prune.go:62
main.glob..func19
        github.com/restic/restic/cmd/restic/cmd_prune.go:27
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra/command.go:838
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra/command.go:943
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra/command.go:883
main.main
        github.com/restic/restic/cmd/restic/main.go:86
runtime.main
        runtime/proc.go:204
runtime.goexit
        runtime/asm_amd64.s:1374

Thanks to the excellent support forum, I was able to resolve this issue by dropping a single snapshot.

First, I identified the snapshot which contained the offending pack:

$ restic -r sftp:hostname.local: find --pack 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5
repository b0b0516c opened successfully, password is correct
Found blob 2beffa460d4e8ca4ee6bf56df279d1a858824f5cf6edc41a394499510aa5af9e
 ... in file /home/francois/.local/share/akregator/Archive/http___udd.debian.org_dmd_feed_
     (tree 602b373abedca01f0b007fea17aa5ad2c8f4d11f1786dd06574068bf41e32020)
 ... in snapshot 5535dc9d (2020-06-30 08:34:41)

Then, I could simply drop that snapshot:

$ restic -r sftp:hostname.local: forget 5535dc9d
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  1 / 1 files deleted

and run the prune command to remove the snapshot, as well as the incomplete packs that were also mentioned in the above output but could never be removed due to the other error:

$ restic -r sftp:hostname.local: prune
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[20:11] 100.00%  77439 / 77439 packs
incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
repository contains 77434 packs (2384522 blobs) with 367.648 GiB
processed 2384522 blobs: 1165510 duplicate blobs, 47.331 GiB duplicate
load all snapshots
find data that is still in use for 15 snapshots
[1:11] 100.00%  15 / 15 snapshots
found 1006062 of 2384522 data blobs still in use, removing 1378460 blobs
will remove 5 invalid files
will delete 13728 packs and rewrite 15140 packs, this frees 142.285 GiB
[4:58:20] 100.00%  15140 / 15140 packs rewritten
counting files in repo
[18:58] 100.00%  50164 / 50164 packs
finding old index files
saved new indexes as [340cb68f 91ff77ef ee21a086 3e5fa853 084b5d4b 3b8d5b7a d5c385b4 5eff0be3 2cebb212 5e0d9244 29a36849 8251dcee 85db6fa2 29ed23f6 fb306aba 6ee289eb 0a74829d]
remove 190 old index files
[0:00] 100.00%  190 / 190 files deleted
remove 28868 old packs
[1:23] 100.00%  28868 / 28868 files deleted
done
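The manual steps above (read the pack ID from the prune error, look up the snapshots that reference it, forget them, prune again) can be sketched as a small Python helper. This is only an illustration under stated assumptions: it parses text in the format shown in this post and builds the same three restic invocations used above, without running anything. The function names are hypothetical, and since forgetting a snapshot discards backup history, the generated commands are meant to be reviewed by hand.

```python
import re

# Patterns match the restic output quoted in this post; other restic
# versions may word these messages differently (assumption).
HASH_MISMATCH = re.compile(
    r"hash does not match id: want (?P<want>[0-9a-f]{64}), got (?P<got>[0-9a-f]{64})"
)
IN_SNAPSHOT = re.compile(r"\.\.\. in snapshot (?P<id>[0-9a-f]{8})\b")


def corrupted_pack(prune_output):
    """Return the ID of the pack whose hash check failed, or None."""
    match = HASH_MISMATCH.search(prune_output)
    return match.group("want") if match else None


def snapshots_referencing(find_output):
    """Return the snapshot IDs listed by `restic find --pack`, in order."""
    seen = []
    for match in IN_SNAPSHOT.finditer(find_output):
        if match.group("id") not in seen:
            seen.append(match.group("id"))
    return seen


def find_command(repo, pack_id):
    """Command line that locates the snapshots containing pack_id."""
    return ["restic", "-r", repo, "find", "--pack", pack_id]


def cleanup_commands(repo, snapshot_ids):
    """Command lines that forget the given snapshots, then prune."""
    commands = [["restic", "-r", repo, "forget", s] for s in snapshot_ids]
    commands.append(["restic", "-r", repo, "prune"])
    return commands
```

Each returned list can be printed for review, or passed to subprocess.run once you are sure you want to drop those snapshots.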

22 Nov 2020 7:30pm GMT

16 Nov 2020

feedPlanet Lisp

Michał Herda: Damn Fast Priority Queue: a speed-oriented priority queue implementation

I think I have accidentally outperformed all of the Quicklisp priority queue implementations. Enter Damn Fast Priority Queue.

Detailed description and benchmarks are available on the GitHub repository. It seems that my implementation is consistently an order of magnitude faster than most of the other priority heaps (with Pileup being the runner-up, only being about 3-4x slower than DFPQ).

16 Nov 2020 10:18pm GMT

27 Apr 2020

feedPlanet Sun

The Hubble Space Telescope celebrates its 30th birthday

For 30 years, the Hubble Space Telescope has provided the most impressive images from the vastness of space. The telescope was developed by the US space agency NASA and its European counterpart ESA. Hubble started its journey into space on 24 April 1990. With the help of the space shuttle "Discovery" it was lifted into ... Read more

27 Apr 2020 3:47pm GMT

26 Apr 2019

feedPlanet Sun

First Image of a Black Hole – Event Horizon

The Event Horizon Telescope (EHT) - a planet-scale array of 8 ground-based radio telescopes and part of an international collaboration - captured the first image of a black hole. On April 10th 2019, EHT researchers disclosed the first direct visual evidence of a supermassive black hole in the heart of the Galaxy Messier 87.

26 Apr 2019 2:32am GMT

04 Nov 2018

feedPlanet Sun

5 Budget-Friendly Telescopes You Can Choose For Viewing Planets

Socrates couldn't have been more right when he said: "I know one thing, that I know nothing." Even with all of the advancements we, as a species, have made in this world, it's still nothing compared to the countless wonders waiting to be discovered in the vast universe. If you've recently developed an interest in ...

04 Nov 2018 1:27pm GMT

10 Nov 2011

feedPlanetJava

OSDir.com - Java: Oracle Introduces New Java Specification Requests to Evolve Java Community Process

From the Yet Another dept.:

To further its commitment to the Java Community Process (JCP), Oracle has submitted the first of two Java Specification Requests (JSRs) to update and revitalize the JCP.

10 Nov 2011 6:01am GMT

OSDir.com - Java: No copied Java code or weapons of mass destruction found in Android

From the Fact Checking dept.:

ZDNET: Sometimes the sheer wrongness of what is posted on the web leaves us speechless. Especially when it's picked up and repeated as gospel by otherwise reputable sites like Engadget. "Google copied Oracle's Java code, pasted in a new license, and shipped it," they reported this morning.



Sorry, but that just isn't true.

10 Nov 2011 6:01am GMT

OSDir.com - Java: Java SE 7 Released

From the Grande dept.:

Oracle today announced the availability of Java Platform, Standard Edition 7 (Java SE 7), the first release of the Java platform under Oracle stewardship.

10 Nov 2011 6:01am GMT

08 Nov 2011

feedfosdem - Google Blog Search

papupapu39 (papupapu39)'s status on Tuesday, 08-Nov-11 00:28 ...

papupapu39 · http://identi.ca/url/56409795 #fosdem #freeknowledge #usamabinladen · about a day ago from web. ...

08 Nov 2011 12:28am GMT

05 Nov 2011

feedfosdem - Google Blog Search

Write and Submit your first Linux kernel Patch | HowLinux.Tk ...

FOSDEM (Free and Open Source Development European Meeting) is a European event centered around Free and Open Source software development. It is aimed at developers and all interested in the Free and Open Source news in the world. ...

05 Nov 2011 1:19am GMT

03 Nov 2011

feedfosdem - Google Blog Search

Silicon Valley Linux Users Group – Kernel Walkthrough | Digital Tux

FOSDEM (Free and Open Source Development European Meeting) is a European event centered around Free and Open Source software development. It is aimed at developers and all interested in the Free and Open Source news in the ...

03 Nov 2011 3:45pm GMT

28 Oct 2011

feedPlanet Ruby

O'Reilly Ruby: MacRuby: The Definitive Guide

Ruby and Cocoa on OS X, the iPhone, and the Device That Shall Not Be Named

28 Oct 2011 8:00pm GMT

14 Oct 2011

feedPlanet Ruby

Charles Oliver Nutter: Why Clojure Doesn't Need Invokedynamic (Unless You Want It to be More Awesome)

This was originally posted as a comment on @fogus's blog post "Why Clojure doesn't need invokedynamic, but it might be nice". I figured it's worth a top-level post here.

Ok, there are some good points here and a few misguided/misinformed positions. I'll try to cover everything.

First, I need to point out a key detail of invokedynamic that may have escaped notice: any case where you must bounce through a generic piece of code to do dispatch -- regardless of how fast that bounce may be -- prevents a whole slew of optimizations from happening. This might affect Java dispatch, if there's any argument-twiddling logic shared between call sites. It would definitely affect multimethods, which are using a hand-implemented PIC. Any case where there's intervening code between the call site and the target would benefit from invokedynamic, since invokedynamic could be used to plumb that logic and let it inline straight through. This is, indeed, the primary benefit of using invokedynamic: arbitrarily complex dispatch logic folds away allowing the dispatch to optimize as if it were direct.

Your point about inference in Java dispatch is a fair one...if Clojure is able to infer all cases, then there's no need to use invokedynamic at all. But unless Clojure is able to infer all cases, then you've got this little performance time bomb just waiting to happen. Tweak some code path and obscure the inference, and kablam, you're back on a slow reflective impl. Invokedynamic would provide a measure of consistency; the only unforeseen perf impact would be when the dispatch turns out to *actually* be polymorphic, in which case even a direct call wouldn't do much better.

For multimethods, the benefit should be clear: the MM selection logic would be mostly implemented using method handles and "leaf" logic, allowing hotspot to inline it everywhere it is used. That means for small-morphic MM call sites, all targets could potentially inline too. That's impossible without invokedynamic unless you generate every MM path immediately around the eventual call.
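To make the idea of dispatch built from method handles more concrete, here is a minimal sketch of a monomorphic inline cache using `MutableCallSite` and `MethodHandles.guardWithTest`. It is not Clojure's or JRuby's actual implementation: the class `InlineCacheDemo`, the `CachingSite` type, the two `describe*` targets, and the class-keyed method table are all hypothetical names invented for illustration, and the site is driven through `dynamicInvoker()` rather than a real `invokedynamic` instruction.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; none of these names come from Clojure or JRuby.
class InlineCacheDemo {
    // Two stand-in method implementations, selected by receiver class.
    static Object describeString(Object o) { return "string:" + o; }
    static Object describeInteger(Object o) { return "int:" + o; }

    // Guard used by the cache: does the receiver have exactly the expected class?
    static boolean hasClass(Class<?> expected, Object receiver) {
        return receiver.getClass() == expected;
    }

    static final class CachingSite extends MutableCallSite {
        private final Map<Class<?>, MethodHandle> table = new HashMap<>();
        private final MethodHandle classTest;
        private final MethodHandle slowPath;
        int misses = 0;

        CachingSite() {
            super(MethodType.methodType(Object.class, Object.class));
            MethodType impl = MethodType.methodType(Object.class, Object.class);
            MethodHandle test;
            MethodHandle slow;
            try {
                MethodHandles.Lookup lookup = MethodHandles.lookup();
                table.put(String.class, lookup.findStatic(InlineCacheDemo.class, "describeString", impl));
                table.put(Integer.class, lookup.findStatic(InlineCacheDemo.class, "describeInteger", impl));
                test = lookup.findStatic(InlineCacheDemo.class, "hasClass",
                        MethodType.methodType(boolean.class, Class.class, Object.class));
                slow = lookup.findVirtual(CachingSite.class, "miss", impl).bindTo(this);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            classTest = test;
            slowPath = slow;
            setTarget(slowPath); // start unlinked: the first call takes the slow path
        }

        // Slow path: pick the target, then relink the site as guard -> target, else slow path.
        Object miss(Object receiver) throws Throwable {
            misses++;
            Class<?> cls = receiver.getClass();
            MethodHandle impl = table.get(cls);
            if (impl == null) {
                throw new IllegalArgumentException("no method for " + cls);
            }
            setTarget(MethodHandles.guardWithTest(classTest.bindTo(cls), impl, slowPath));
            return impl.invoke(receiver);
        }
    }

    // Helper that hides the checked Throwable declared by MethodHandle.invoke.
    static Object call(MethodHandle invoker, Object arg) {
        try {
            return invoker.invoke(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    static String demo() {
        CachingSite site = new CachingSite();
        MethodHandle invoker = site.dynamicInvoker();
        String out = call(invoker, "x") + " " + call(invoker, "y") + " "
                + call(invoker, 42) + " " + call(invoker, 7);
        return out + " misses=" + site.misses;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the slow path runs only when the receiver class changes. Every other call goes through a guard and a direct handle, which is the kind of shape the JIT is in a position to inline through.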

Now, on to defs and Var lookup. Depending on the cost of Var lookup, using a SwitchPoint-based invalidation plus invokedynamic could be a big win. In Java 7u2, SwitchPoint-based invalidation is essentially free until invalidated, and as you point out that's a rare case. There would essentially be *no* cost in indirecting through a var until that var changes...and then it would settle back into no cost until it changes again. Frequently-changing vars could gracefully degrade to a PIC.
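The SwitchPoint idea can be sketched as follows. This is a simplified illustration under stated assumptions, not Clojure's `Var` or JRuby's constant lookup: the `Var` and `VarSite` classes here are hypothetical, and the call site is exercised through `dynamicInvoker()` rather than a compiled `invokedynamic` instruction. The site caches the current value as a constant handle guarded by a `SwitchPoint`, and relinks only after the var is reassigned.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.SwitchPoint;

// Hypothetical demo class; Var and VarSite are illustrative stand-ins.
class SwitchPointVarDemo {
    // A mutable cell plus the SwitchPoint that guards cached reads of it.
    static final class Var {
        Object value;
        SwitchPoint switchPoint = new SwitchPoint();

        Var(Object value) { this.value = value; }

        synchronized void set(Object newValue) {
            value = newValue;
            SwitchPoint stale = switchPoint;
            switchPoint = new SwitchPoint();
            // Every call site guarded by the stale SwitchPoint now falls back.
            SwitchPoint.invalidateAll(new SwitchPoint[] { stale });
        }
    }

    static final class VarSite extends MutableCallSite {
        private final Var ref;
        private final MethodHandle slowPath;
        int relinks = 0;

        VarSite(Var ref) {
            super(MethodType.methodType(Object.class));
            this.ref = ref;
            MethodHandle slow;
            try {
                slow = MethodHandles.lookup()
                        .findVirtual(VarSite.class, "relink", MethodType.methodType(Object.class))
                        .bindTo(this);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            slowPath = slow;
            setTarget(slowPath); // unlinked until the first read
        }

        // Slow path: read the var, cache the value as a constant guarded by the SwitchPoint.
        Object relink() {
            relinks++;
            synchronized (ref) {
                Object current = ref.value;
                MethodHandle fast = MethodHandles.constant(Object.class, current);
                setTarget(ref.switchPoint.guardWithTest(fast, slowPath));
                return current;
            }
        }
    }

    // Helper that hides the checked Throwable declared by MethodHandle.invoke.
    static Object read(MethodHandle invoker) {
        try {
            return invoker.invoke();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    static String demo() {
        Var v = new Var("a");
        VarSite site = new VarSite(v);
        MethodHandle invoker = site.dynamicInvoker();
        String out = read(invoker) + " " + read(invoker);
        v.set("b"); // invalidates the cached read
        out += " " + read(invoker) + " " + read(invoker);
        return out + " relinks=" + site.relinks;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Reads between reassignments never touch the `Var` in this sketch. They go through the guarded constant, and the slow path runs once after each `set`.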

It's also dangerous to understate the impact code size has on JVM optimization. The usual recommendation on the JVM is to move code into many small methods, possibly using call-through logic as in multimethods to reuse the same logic in many places. As I've mentioned, that defeats many optimizations, so the next approach is often to hand-inline logic everywhere it's used, to let the JVM have a more optimizable view of the system. But now we're stepping on our own feet...by adding more bytecode, we're almost certainly impacting the JVM's optimization and inlining budgets.

OpenJDK (and probably the other VMs too) has various limits on how far it will go to optimize code. A large number of these limits are based on the bytecoded size of the target methods. Methods that get too big won't inline, and sometimes won't compile. Methods that inline a lot of code might not get inlined into other methods. Methods that inline one path and eat up too much budget might push out more important calls later on. The only way around this is to reduce bytecode size, which is where invokedynamic comes in.

As of OpenJDK 7u2, MethodHandle logic is not included when calculating inlining budgets. In other words, if you push all the Java dispatch logic or multimethod dispatch logic or var lookup into mostly MethodHandles, you're getting that logic *for free*. That has had a tremendous impact on JRuby performance; I had previous versions of our compiler that did indeed infer static target methods from the interpreter, but they were often *slower* than call site caching solely because the code was considerably larger. With invokedynamic, a call is a call is a call, and the intervening plumbing is not counted against you.

Now, what about negative impacts to Clojure itself...

#0 is a red herring. JRuby supports Java 5, 6, and 7 with only a few hundred lines of changes in the compiler. Basically, the compiler has abstract interfaces for doing things like constant lookup, literal loading, and dispatch that we simply reimplement to use invokedynamic (extending the old non-indy logic for non-indified paths). In order to compile our uses of invokedynamic, we use Rémi Forax's JSR-292 backport, which includes a "mock" jar with all the invokedynamic APIs stubbed out. In our release, we just leave that library out, reflectively load the invokedynamic-based compiler impls, and we're off to the races.

#1 would be fair if the Oracle Java 7u2 early-access drops did not already include the optimizations that gave JRuby those awesome numbers. The biggest of those optimizations was making SwitchPoint free, but also important are the inlining discounting and MutableCallSite improvements. The perf you see for JRuby there can apply to any indirected behavior in Clojure, with the same perf benefits as of 7u2.

For #2, to address the apparent vagueness in my blog post...the big perf gain was largely from using SwitchPoint to invalidate constants rather than pinging a global serial number. Again, indirection folds away if you can shove it into MethodHandles. And it's pretty easy to do it.

#3 is just plain FUD. Oracle has committed to making invokedynamic work well for Java too. The current thinking is that "lambda", the support for closures in Java 8, will use invokedynamic under the covers to implement "function-like" constructs. Oracle has also committed to Nashorn, a fully invokedynamic-based JavaScript implementation, which has many of the same challenges as languages like Ruby or Python. I talked with Adam Messinger at Oracle, who explained to me that Oracle chose JavaScript in part because it's so far away from Java...as I put it (and he agreed) it's going to "keep Oracle honest" about optimizing for non-Java languages. Invokedynamic is driving the future of the JVM, and Oracle knows it all too well.

As for #4...well, all good things take a little effort :) I think the effort required is far lower than you suspect, though.

14 Oct 2011 2:40pm GMT

07 Oct 2011

feedPlanet Ruby

Ruby on Rails: Rails 3.1.1 has been released!

Hi everyone,

Rails 3.1.1 has been released. This release requires at least sass-rails 3.1.4.

CHANGES

ActionMailer

ActionPack

ActiveModel

ActiveRecord

ActiveResource

ActiveSupport

Railties

SHA-1

You can find an exhaustive list of changes on GitHub, along with the closed issues marked for v3.1.1.

Thanks to everyone!

07 Oct 2011 5:26pm GMT

26 Jul 2008

feedFOSDEM - Free and Open Source Software Developers' European Meeting

Update your RSS link

If you see this message in your RSS reader, please correct your RSS link to the following URL: http://fosdem.org/rss.xml.

26 Jul 2008 5:55am GMT

25 Jul 2008

feedFOSDEM - Free and Open Source Software Developers' European Meeting

Archive of FOSDEM 2008

These pages have been archived.
For information about the latest FOSDEM edition, please check this URL: http://fosdem.org

25 Jul 2008 4:43pm GMT

09 Mar 2008

feedFOSDEM - Free and Open Source Software Developers' European Meeting

Slides and videos online

Two weeks after FOSDEM, we are proud to publish most of the slides and videos from this year's edition.

All of the material from the Lightning Talks has been put online. We are still missing some slides and videos from the Main Tracks but we are working hard on getting those completed too.

We would like to thank our mirrors: HEAnet (IE) and Unixheads (US) for hosting our videos, and NamurLUG for quick recording and encoding.

The videos from the Janson room were live-streamed during the event and are also online on the Linux Magazin site.

We are having some synchronisation issues with Belnet (BE) at the moment. We're working to sort these out.

09 Mar 2008 3:12pm GMT