18 Dec 2024
planet.freedesktop.org
Peter Hutterer: A new issue policy for libinput - closing and reopening issues for fun and profit
This is a heads up that if you file an issue in the libinput issue tracker, it's very likely this issue will be closed. And this post explains why that's a good thing, why it doesn't mean what you want, and most importantly why you shouldn't get angry about it.
Unfixed issues have, roughly, two states: they're either waiting for someone who can triage and ideally fix it (let's call those someones "maintainers") or they're waiting on the reporter to provide some more info or test something. Let's call the former state "actionable" and the latter "needinfo". The first state is typically not explicitly communicated, but the latter can be communicated via different means, most commonly a "needinfo" label. Labels are of course great because you can be explicit about what is needed, and with our bugbot you can automate much of this.
Alas, using labels has one disadvantage: GitLab does not allow the typical bug reporter to set or remove labels - you need to have at least the Planner role in the project (or group) and, well, surprisingly reporting an issue doesn't mean you get immediately added to the project. So once a "needinfo" label is set, only a maintainer can remove it again. And until that happens you have an open bug that has needinfo set and looks like it's still needing info. Not a good look, that is.
So how about we use something other than labels, so the reporter can communicate that the bug has changed to actionable? Well, as it turns out there is exactly one thing a reporter can do on their own bugs other than post comments: close it and re-open it. That's it [1]. So given this vast array of options (one button!), we shall use them (click it!).
So for the foreseeable future libinput will follow this pattern:
- Reporter files an issue
- Maintainer looks at it, posts a comment requesting some information, closes the bug
- Reporter attaches information, re-opens bug
- Maintainer looks at it and either: files a PR to fix the issue or closes the bug with the wontfix/notourbug/cantfix label
Obviously the close/reopen stage may happen a few times. For the final closing where the issue isn't fixed the labels actually work well: they preserve for posterity why the bug was closed and in this case they do not need to be changed by the reporter anyway. But until that final closing the result of this approach is that an open bug is a bug that is actionable for a maintainer.
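To make the automation concrete, here is a rough sketch of what the "comment, then close" step could look like if scripted against GitLab's API with the python-gitlab library. This is purely illustrative - libinput's actual bugbot is its own tool, and the server URL, project path, issue number, and token below are placeholders.

```python
# Illustrative only: comment on an issue and close it via python-gitlab.
# The token, project path and issue number are placeholders.
import gitlab

gl = gitlab.Gitlab("https://gitlab.freedesktop.org", private_token="XXXX")
project = gl.projects.get("libinput/libinput")
issue = project.issues.get(1234)

# Explain why the issue is being closed, so the reporter knows to re-open it.
issue.notes.create({
    "body": "Closing while we wait for the requested information. "
            "Please re-open this issue once you've attached it."
})
issue.state_event = "close"
issue.save()
```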
This process should work (in libinput at least); all it requires is for reporters not to get grumpy about their issue being closed. And that's where this blog post (and the comments bugbot will add when closing) come in. So here's hoping. And to stave off the first question: yes, I too wish there was a better (and equally simple) way to go about this.
[1] we shall ignore magic comments that are parsed by language-understanding bots because that future isn't yet the present
18 Dec 2024 3:21am GMT
17 Dec 2024
planet.freedesktop.org
Donnie Berkholz: The lazy technologist’s guide to staying healthy
TL;DR - I've lost a ton of weight from mid-2023 to early 2024 and maintained the vast majority of that loss. I've also begun exercising and had great results in my fitness and strength. Here, I'm sharing what I've learned as well as a bunch of my tips and tricks. Overall on the diet side, it's about eating a wide variety and healthy ratio of colorful, minimally processed whole foods, with natural flavor and sweetness, only during meals. On the exercise side, I do both cardio and resistance training. For cardio, I focus on post-meal, moderate-intensity cardio (specifically, 1-mile brisk walks). For strength training, I use calisthenics-based compound exercises (complex multi-muscle movements) 2x/wk, performing a single set to near-exhaustion. I've optimized this down from 3 sets 3x/wk, based on my experience and academic research in the area.
In the past 18 months, I've lost 75 pounds and gone from completely sedentary to fit, while minimizing the effort to do so (but needing a whole lot of persistence and grit). On the fitness side, I've taken my cardiorespiratory fitness from below average to high, and I'm stronger than I've been in my entire life. Again I've aimed to do so with maximum efficiency, shooting for the 80% of value with 20% of effort.
Here's what I wrote in my initial post on weight loss:
I have no desire to be a bodybuilder, but I want to be in great shape now and be as healthy and mobile as possible well into my old age. And a year ago, my blood pressure was already at pre-hypertension levels, despite being at a relatively young age.
Research shows that 5 factors are key to a long life - extending your life by 12-14 years:
- Never smoking
- BMI of 18.5-24.9
- 30+ min a day of moderate/vigorous exercise
- Moderate alcohol intake (vs none, occasional, or heavy)
- Unsurprisingly, there is vigorous scientific and philosophical/religious/moral debate about this one. However all studies agree that heavy drinking is bad, so ensure you avoid that.
- Diet quality in the upper 40% (Alternate Healthy Eating Index)
Additionally, people who are in good health have a much shorter end-of-life period. This means they can enjoy a longer healthy part of their lives (the "healthspan") and squeeze the toughest times into a shorter period right at the end. After seeing many seniors struggle for years as they got older, I wanted my own story to end differently.
Although I'm no smoker, I lacked three other factors. My weight was incredibly unhealthy, I was completely sedentary, and my diet was terrible. I do drink moderately, however (nearly all beer).
This post accompanies my earlier writeups, "The lazy technologist's guide to weight loss" and "The lazy technologist's guide to fitness". Check them out for in-depth, science-driven reviews of my experience losing weight and getting fit.
Why is this the lazy technologist's guide, again? I wanted to lose weight in the "laziest" way possible - in the same sense that lazy programmers work to find the most efficient solutions to problems. I'll reference an apocryphal quote by Bill Gates and a real one by Larry Wall, creator of Perl. Gates supposedly said, "I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it." Wall wrote in Programming Perl, "Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it."
What's the lowest-effort, most research-driven way to lifelong health, whether you're losing weight, getting in shape, or trying to maintain your current healthy weight or state after putting in a whole lot of time and effort getting there? Discovering and executing upon that was my journey. Read on if you're considering taking a similar path.
Hitting my goals
Since my posts early this year, I broke through into my target ranges for both maintenance weight and fitness. In mid-April, I hit a low of 164 lbs. Since then, I've been gradually transitioning into maintenance mode, hovering within ~10 lbs of my low. As I write this, I'm about 10 pounds above my minimum weight, at a current BMI of 23. At my lowest, I had a BMI around 22.
On the fitness side, in late May, I broke into the VO2Max range for high cardiorespiratory fitness. (In my case, that's 47 based on my age and gender, as measured by my Apple Watch.)
In the next few sections, I'll share how I've continued to change what I eat and how I work out to keep improving my overall health.
Evolving what I eat for long-term health
In this section, I'll share a lot of what I've learned regarding how to eat healthier. There's a lot to it, from focusing on whole foods with enough protein and fiber to eating enough veggies and managing portion sizes, so dig in for all the details!
Keep up the protein
As I wrote in the post on weight loss, high protein is a great way to lose weight and maintain or build muscle. Protein also promotes fullness, so I've shifted my diet so that every meal (breakfast included) has a good amount of protein - targeting 25%-30% of daily calories. I used to get quite hungry in the late morning, before it was time to eat lunch. That's no longer a concern even when I'm on a caloric deficit, let alone eating at maintenance.
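To put that target into concrete numbers, here's the arithmetic assuming a 2,000-calorie day (the calorie figure is just an example, not a recommendation):

```python
# Rough protein math for a hypothetical 2,000-calorie day.
calories_per_day = 2000
protein_fraction = 0.28                # middle of the 25%-30% target
protein_calories = calories_per_day * protein_fraction
protein_grams = protein_calories / 4   # protein is ~4 calories per gram
print(round(protein_grams), "g of protein per day")  # -> 140 g
```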
Use Mediterranean plate ratios
Although I'm not officially eating a Mediterranean diet, I've found its plate ratios to be incredibly valuable:
- 1/2 vegetable
- 1/4 lean protein (white meat, seafood, lentils/beans)
- 1/4 starchy carb (whole grains or starchy vegetables, avoiding white/processed grains)
Building meals that way makes it very hard for me to overeat, because the vegetables are so high-volume and low-calorie that they take up a lot of space in my stomach. Following this guideline is especially helpful at restaurants, which I'll detail later.
My main exception is breakfast, where I do incorporate veggies but not as half of my meal. Veggies plus fruits are certainly half of it, though.
Count calories for a while, and then set a permanent weight-gain trigger
After overeating for a sizable fraction of my lifetime, and then eating at a large deficit for a year, I need to teach myself what sustainable eating habits look like because they clearly aren't intuitive for me. The "intuitive eating" trend may work for people who already have a habit of healthy eating and weight maintenance, but not for the rest of us - our intuition is broken from years or decades of bad habits.
As a result, calorie counting at maintenance is a good practice to learn what the correct amount of food per day looks and feels like.
My plan is to continue counting calories at maintenance until I'm confident that I'm no longer gaining weight, and then stop. However, that raises the risk that my weight could then start increasing again, because it's incredibly common for people to re-gain the weight they've lost. Around 80%-90% of people fail to maintain their weight loss - mostly those who don't exercise and stop tracking their eating/weight. There are great studies from the US National Weight Control Registry about the habits of people who keep their weight off.
As a process control, I'm going to continue weighing myself daily. I'm setting an upper limit of 5 pounds above my target weight that will trigger me to begin calorie counting again. To avoid reacting to the random deviations that accompany daily weight, I've started using a specialized app called Happy Scale that is designed for creating smoothed trends for body weight. You could also do this in a spreadsheet, but I like the ease of use of this app.
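Happy Scale doesn't publish its exact algorithm, but the idea of a smoothed weight trend can be sketched with a simple exponentially weighted moving average. The weigh-ins, smoothing factor, and target below are made up for illustration:

```python
# Minimal sketch of weight-trend smoothing; not Happy Scale's actual algorithm.
def smooth(weights, alpha=0.1):
    """Return a smoothed trend from daily weigh-ins (alpha = smoothing factor)."""
    trend = [weights[0]]
    for w in weights[1:]:
        trend.append(trend[-1] + alpha * (w - trend[-1]))
    return trend

daily = [174.2, 175.8, 173.9, 176.1, 174.5, 175.0, 176.4]  # made-up weigh-ins, lbs
target = 170
trend = smooth(daily)
print([round(t, 1) for t in trend])
if trend[-1] > target + 5:
    print("Trend is 5+ lbs above target - time to start counting calories again")
```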
Dine out at restaurants, safely
Eating out at restaurants (or getting takeout/takeaway) is a challenge that a lot of people on diets - or just trying to eat healthy - can't figure out how to make work. A lot of people just give up and always order a salad. Surprisingly, that can trick you into thinking you're eating healthy without actually doing so. I've created a set of guidelines that I follow when eating out:
- Aim for lean protein & veggies, prepared simply (e.g. grilled, roasted, sautéed, steamed).
- Always start with veggies. If your meal doesn't come with them, order a starter salad or veggies as an appetizer.
- Minimize high-fat, calorie-dense sauces & toppings. Watch out for anything based on cream (like Alfredo sauce), cheese, mayo (aioli), oil, or butter. A little bit of a high-flavor cheese is great (like finely grated parmesan, or crumbled feta/goat cheese), but avoid the cheese sauce or a big pile of shredded cheese. Get meals served with tomato-based sauces, slices of lime/lemon or just spices/seasonings, which bring tons of flavor without the calories. If it comes with a calorie-dense sauce, ask for it on the side and dip your bites instead of getting your meal drenched in it. You'll often be shocked by how big of a cup they provide for the sauce, which would've been coating your food.
- In salads, always get dressing on the side, and prefer oil & vinegar or a vinaigrette. Do the same with any other high-fat sauces - get them on the side. That way, you're in control of the portion, or you can just dip bites. Salad dressings can have hundreds of calories in them. If you add a huge pile of cheese and croutons, and maybe some processed meats like pepperoni or some oil-covered pasta, then you've just turned a healthy meal into the opposite.
- Avoid breaded, deep-fried foods. This includes the protein as well as French fries or chips/crisps.
- Don't eat the table bread when it comes out first at a restaurant. Eat veggies, then protein, and only then starchy carbs. Remember, only 1/4 of your meal can be starchy carbs, according to the Mediterranean plate ratio (bread, rice, pasta, potatoes, etc).
- Avoid meals that are 1/2 or more starchy carbs.
- Only eat half of what you order, because restaurant portions are massive - big enough for 2 meals, sometimes 3. Split it physically on your plate when you get it, and ask for a box as soon as possible.
As one example, I love burgers. When I order one, I'll look for a healthier, simpler option instead of the one with 15 fatty add-ons, I'll stick with a single patty instead of a double, and I'll often ask for the aioli on the side. That way, I can lightly dip each bite if it needs the flavor. I'll frequently get a turkey or bison patty instead of beef, and I'll often order it without a bun - either on a bed of lettuce (eaten with a fork & knife), or wrapped in lettuce instead of the bun. For the side, instead of fries, I'll get a side salad (no croutons, no cheese, vinaigrette on the side), veggies, or fruit. Sometimes I'll get coleslaw or a lower-calorie soup, when that's the best option. I allow myself one "extra" from my guidelines, and it's usually getting cheese on the burger (the other toppings are veggies).
As another example, noodle/rice dishes at Italian, Indian or Asian restaurants (Chinese, Japanese, Vietnamese, etc) are common. Get a stir-fry, add lots of veggies, get the grilled/roasted chicken or seafood, avoid the buttery/creamy sauces, and/or eat less of the rice/noodle part of the dish. If you get sushi, prefer sashimi and rolls over nigiri, which has a lot more rice. When you do order starchy carbs, prefer the whole-grain version when possible (brown rice, whole-wheat pasta, etc). When you can make it work, first eat the veggies, then protein, then grains.
Sometimes you're stuck at a place that doesn't fit any of those guidelines. Fast-food restaurants like McDonald's, Burger King, or Dairy Queen have no healthy meal options - no grilled chicken, no salad or wraps without fried food, etc. In those cases, I'll order smaller portions, like a kid's meal, or a single cheeseburger and the smallest size of fries (the one that comes in a little bag instead of a fry holder), with a cup of water. Another option is a double fish sandwich, if you order it without tartar sauce and skip the bun. You can probably manage a meal around 500-600 calories, but you'll be hungry because you hardly got any veggies or fiber, so you're missing out on fullness signals. You also will have eaten all kinds of ultraprocessed ingredients instead of healthy whole foods, which we'll discuss later.
Eat like it's the 1950s
In the US, if you go back to before we had ultraprocessed foods, people ate very differently. Most of that emerged in the 1960s and really gained popularity in the 1970s, so let's return to the 1950s.
Eat a savory breakfast
Before we had overwhelmingly sugar-doused cereal, people often ate breakfast differently. It might be leftovers from the night before, or it could be oatmeal, peanut-butter toast, or something like eggs & bacon. In general, breakfasts were much more savory than sweet.
I've adopted that philosophy, shifting away from breakfasts like sweet cereal or flavored yogurt (both with plenty of added sugar) to a more savory approach, or at least foods with no sugar added. Most often, I'll have something with eggs and beans, as well as a separate bowl with berries and plain skyr or Greek yogurt. The fruit adds plenty of flavor and sweetness, so there's no need to add any more from sugar/honey/etc.
Eliminate snacks
Before ultraprocessed foods, snacks also weren't really a thing. There weren't food companies trying to create opportunities for profit through people eating outside of typical meals. You'd eat breakfast, lunch, and dinner, and that was it. Eating random snacks throughout the day didn't really exist, although some families might have an extra mini-meal of sorts at some point.
Decrease portion size by decreasing plate size
Additionally, portion sizes have increased dramatically. In part, this is because plateware has increased in size. For example, the diameter of plates has increased from 9″ in the 1950s to 10.5″-11″ between the 1980s and 2000, and as much as 12″-13″ today. People will subconsciously take larger portions and eat more calories when their plates are larger, as academic studies have shown.
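A quick back-of-the-envelope calculation shows how much that matters. Assuming the usable surface grows with the square of the diameter, and using 12.5″ as a stand-in for today's 12″-13″ plates:

```python
# How much more food fits on a bigger plate (area scales with diameter squared).
import math

def plate_area(diameter_inches):
    return math.pi * (diameter_inches / 2) ** 2

baseline = plate_area(9)  # a typical 1950s plate
for d in (9, 10.5, 12.5):
    print(f'{d}" plate: {plate_area(d):.0f} sq in, '
          f'{plate_area(d) / baseline:.0%} of the 1950s plate')
```

In other words, a modern 12″-13″ plate has nearly double the area of a 1950s plate, so filling it "normally" means serving yourself almost twice as much.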
This brings us to another easy thing I did to eat healthier - I reduced the size of my plates, bowls, and glasses. Even without buying new plates, I started only adding food to the "inner ring" instead of all the way to the edge, and stopped piling anything on top of other food. I bought new, smaller bowls and glasses, because those were harder to manage. And when I eat out or get takeout, I have a mental baseline to compare to their plate sizes. I also watch out for restaurants using multiple courses to keep me from noticing how much I'm eating.
To sum up, I switched to a savory breakfast, eliminated snacks outside of meals, and reduced the size of my plateware. Even if you only do 1 or 2 of those 3 things, that'll make a meaningful difference.
Ultraprocessed foods trick your body
I've read quite a bit about ultraprocessed foods. The summary is that they are effectively ways to trick your body into thinking it's getting something that's not really there. Artificial sweeteners, things with the taste & consistency of fat that have no fat, and artificial/natural flavors in foods that make your body expect something else are just a few examples.
Sugar that's not sugar
When your body tastes something sweet, it expects that it will soon get an influx of calories from sugar to digest. Artificial sweeteners mess with this, tricking your body. A number of studies have shown that people tend to make up for these "lost" calories by subconsciously eating more later that day. It's possible to prevent this with strict calorie counting, but it's a bias you want to be aware of. It's also unclear what these mixed signals will do to your body over the long term, when it can't tell what calories to expect based on what you taste. As a result, I've begun avoiding alternative sweeteners, and just getting something with sugar if that's really what I want.
Fat that's not fat
This one is especially sneaky, because you can't always spot it in the ingredient labels. Using seemingly normal ingredients, companies have created fat substitutes with unique structures that provide the same sort of mouth-feel as fat, without containing the expected levels of fat. These can come in the easily identifiable varieties such as all sorts of "gum" - this results in ice cream that basically doesn't melt, for example. "Whey protein concentrate" is another common one, as is anything with "dextrin" in the name and a variety of emulsifiers such as "polyesters." You need to work (and typically pay a premium) for things like ice cream or chocolate with a simple ingredient list, because natural ingredients cost more and often don't transport as well.
Flavor that's not flavor
Flavors in the wrong foods are another example of tricking your body into expecting a different set of nutrients than it gets. This can cause you to develop cravings for unhealthy foods, based on your desire for a particular flavor profile that comes from added flavorings. For example, you might want some orange-, apple-, or grape-flavored drink instead of actual oranges, apples, or grapes. Your body will cause you to crave certain things based upon their nutrient profile, and what your body needs. This is most obvious in pregnancy and in studies done on babies/toddlers given free choice on what to eat.
Micronutrients that don't belong
"Enriched" foods are stealing your health, again based on artificially induced cravings for unnaturally added ingredients. A good case study here is flour in bread. In the early 1940s, the United States passed a law requiring enrichment of bread flour to prevent diseases around missing micronutrients (e.g. folic acid, niacin, thiamin, riboflavin, iron). Italy, however, did no such thing - instead, it focused on educating its citizens on healthy diet components. As a result, Italians eat far more beans than Americans, for example, which contain many of the same missing micronutrients. Americans instead eat far more white bread than they should - an ultraprocessed food that our body desires because of the added micronutrients that don't belong.
Salt that's over the top
Overly salty foods are another danger area. In the US, the recommended amount is 2300 mg/day, which it's quite easy to hit even while trying to avoid extra-salty foods. For example, I mostly don't eat ramen, other soups, preserved meats like smoked salmon or beef jerky, or frozen meals. Another surprising one is sugar-sweetened beverages like soda, which have so much sugar that they also add salt to trick you into feeling like they aren't that sweet.
Optimize for gut health
Another area that's become increasingly visible in the past couple of decades is the importance of gut microbiota in health. Keeping them healthy is critical to being healthy. That's come down to a few key factors for me: fibers, fermented food, and reduced alcohol.
Eat enough fiber - and that's a lot!
The average American only eats 10-15 grams of fiber per day, while the recommended daily allowance for adults up to age 50 is 25 grams for women and 38 grams for men. I see this as a general correlation with our consumption of ultraprocessed foods, because fiber is primarily present in whole foods. Whole fruits, whole vegetables, whole grains, legumes (beans/lentils), and seeds are among the best sources of fiber.
As soon as you stop eating the whole, unprocessed food and replace it with something more processed, you lose the benefits. Make sure to eat the whole fruit, including the edible portion of the skin. Even something as simple as making a fruit smoothie or fruit juice will chop up or remove the fiber and other long-chain complex molecules, reducing its nutritional value. Personally, I found it surprisingly hard to modify my diet enough to get enough fiber while I was losing weight, because I'd been eating ultraprocessed foods for so long. In the end, the main things I added were raspberries & blackberries, chia seeds, broccoli & cauliflower, and beans/lentils.
Among fruit, raspberries and blackberries are particularly high fiber (you can tell from all the seeds as you're eating them). Other great options include apples, oranges, pears, grapefruit, and kiwifruit, as long as you eat the edible portion of the skin & rind. Passion fruit is an all-star with many times more fiber, but it's quite expensive. Dried fruit can be a great complement to fresh fruit in moderation - especially golden berries (another all-star), plums, and apricots. It's easy to eat too much dried fruit, though, because all the water's been removed so it doesn't fill you up as quickly. For example, you can eat 5 dried apricots in a few minutes, but imagine eating 5 fresh apricots in a row.
Vegetables are another great source of fiber, but again you need to focus on the right ones. Among non-starchy options (basically anything but root vegetables), broccoli and cauliflower are great choices, as is kale. I like to begin my meals with one of those, whenever I can. Among starchy options, sweet potatoes, carrots, and corn are great choices.
Whole grains (such as whole-wheat bread, the denser the better, and brown rice) are also high in fiber, but they tend to have lots more carbs - while I optimized more for protein. When I'm eating at maintenance, I occasionally have some dense whole-grain breads such as a Danish pumpernickel or a German roggenbrot/vollkornbrot. They're nothing like your typical American pumpernickel or rye, so try to find a bakery near you that offers them. Otherwise, any 100% whole-grain bread (they often have a stamp) with low sugar and a decent amount of protein & fiber is a good option. Any bread with no sugar is even better, but it's hard to find. I'd recommend checking out local bakeries first, then the bakery within your favorite grocery store, followed by national brands such as Dave's Killer Bread or Ezekiel from Food for Life.
Legumes & seeds are a great source - I've saved perhaps the best for last. Beans and lentils are fiber superstars - a single serving around 100 calories could have 5-10 grams. They also offer complete protein (all nine essential amino acids) when combined with a whole-grain rice, such as brown, red or purple. I have a serving of black beans with eggs almost every day.
Fermented foods improve gut health
Another great way to improve the types of gut microbiota is eating more fermented foods. These include foods common in the US like yogurt and sauerkraut, as well as Korean kimchi, increasingly popular drinks like kombucha, and less common ones like kefir (basically drinkable yogurt). The benefits seem to fade away after just a few weeks though, so it's important to maintain consumption instead of thinking you can transform your microbiota once and then you're done.
I'm regularly eating skyr, which is a thick Icelandic yogurt with as much protein as Greek yogurt but not the tangy, bitter flavor. It's a great protein-dense option, even when you eat the version based on whole milk (which I do). I'm also occasionally using kimchi on my eggs or drinking a small half-glass of kefir. Sauerkraut is reserved for summer barbecues, and I haven't gotten into kombucha at this point.
Moderate your alcohol intake
Another thing that helped was reducing the amount of alcohol I drink. Cutting down from a beer every day to more like one a week has made a big difference. I'm feeling more energetic overall, and my gut's much healthier too.
Appreciate the sweeter (natural) things in life
As I learned more about eating healthy, I came across increasing amounts of material about how added sugar caused major problems, leading toward obesity or diabetes. Interestingly, many parts of the world eat far less sweet food, and there tends to be a general correlation between consumption of ultraprocessed sugary food and obesity. In my own life, I've noticed this difference in practice when traveling to Europe and Asia, where many of the desserts are far less sweet (and the obesity rate is much lower). Two great examples are Polish cheesecake (sernik) - which is far less sweet than American cheesecake - and the frequent use of less-sweet ingredients in Asia such as red bean, sesame, or glutinous rice.
Based on this, I've cut down on foods with added sugar. Natural levels of sugar are generally fine, such as that in many fruits, but even then I try to bias toward less-sweet options. For example, I'll typically have an apple or pear instead of mango. Among dried fruit, I avoid dates and figs, tending toward lower-sugar options instead.
Once you start looking, it's shocking how seemingly every processed food has added sugar. This goes all the way down to even basic staples such as bread, unless you specifically look for the rare breads without it.
In America, we've trained ourselves from birth (with sweetened baby food) to eat sweeter and sweeter foods with more and more unnatural levels of sugar, to the point where it tastes too sweet or even sickening to people from other cultures.
As a pleasant side effect of this, I find myself enjoying moderately sweet foods almost in the same way that I used to think of desserts. Fruit like strawberry or mango, chia pudding or overnight oats w/ fruit and no other sweetener, frozen Greek yogurt bars, skyr with cinnamon and just a little honey, dried fruit, trail mix, or 85%+ dark chocolate now taste great.
Try the "No S Diet"
While on my journey, I came across a simple approach called the "No S Diet" that I quite appreciated. It boils down healthy eating into just three rules and one exception:
- No Snacks
- No Sweets
- No Seconds
Except (sometimes) on days that start with "S"
Even this alone would get you a long, long way. Combining it with a Mediterranean diet (plate ratios, whole foods, lean protein) is almost all you need.
I have stopped snacking entirely, as mentioned earlier. I'm a bit more flexible on sweets, if they fit into my calories for the day, but I do try to save more of that for the weekend. For example, I might have a little 85%+ dark chocolate on a weekday after lunch, or some strawberries w/ whipped cream after dinner, but I'll eat a full dessert serving on the weekend.
Eat in the right order
Interestingly, I also learned that even the order in which you eat can make a difference. Specifically, you can flatten blood-sugar spikes by eating in a specific order: fiber, then protein, then starchy carbs.
For example, start with a salad, then eat the main portion of your entree (e.g. chicken or fish), followed by the sides (rice, potatoes or whatever).
This has served me well at home, but it's been especially helpful at restaurants. Every time I go out, I make a point of ordering either a salad or a veggie-based appetizer to enjoy before the main course. Eating in this specific order isn't the only reason that helps - it also uses up a bunch of the room in my stomach on veggies instead of more calorie-dense foods, so I'm often full enough before I finish my starchy carbs.
Add antioxidants
Antioxidants are another great way to eat healthier. These protect your body at a subcellular level from oxidizing reactions, which can damage parts of your cells (especially the mitochondria, basically your cell's energy factory) over time and contribute to aging.
An easy way to identify foods with higher levels of antioxidants is to look for more color. Instead of the bland-looking food, pick one with a stronger color. It could be dark green, red, orange, blue, purple, or something else - just avoid white and beige options within a food family. Although there are many exceptions, this is a good guideline.
Remember: eat the rainbow.
Go for whole grains and prefer resistant starches
Whole grains are hugely more valuable than more processed options. With your typical American white bread, the healthiest bits (the germ and bran) are removed, leaving you with only the endosperm. Whole-grain bread keeps the germ and bran, and with them more of the fiber and micronutrients. It also reduces the blood-sugar spike after meals, which is another great benefit.
Another thing I learned is that there are different types of starches - rapidly digested, slowly digested, and resistant.
Resistant starch takes longer to digest, flattening some of the glucose spikes that can create hunger cravings a couple of hours after meals. Two of the best examples are whole grains (type 1 resistant starch, or RS1) and pasta, potatoes, or rice that's cooked and then cooled (type 3, or RS3).
One way to prefer resistant starch is to aim for foods that are higher in amylose and lower in amylopectin. Amylose is a single straight-chain polymer, so it takes longer to break down and digest, whereas amylopectin is branched with many ends (so it's faster to break down in parallel). That parallel breakdown means you get a sugar spike rather than spreading out the sugar over time. In general, this means whole grains over processed grains, and the more colorful versions of foods. Here are some examples:
- Bread: Whole-wheat/pumpernickel/rye > sourdough/multigrain/50% wheat > white
- Rice: Purple/black/red/wild > brown > long-grain white > short-grain white
- Pasta: Bean-based > whole-wheat > standard (durum)
- Potatoes: Stokes/Okinawan (purple inside) > sweet > white
- Oats: Steel-cut > rolled > instant
Another way to get more resistant starch is to eat more grains that were cooked and then cooled. Pasta salad, potato salad, grain bowls, and reheating leftover rice are a few common examples. Yes, that reheated Chinese stir-fry w/ rice can be healthier than it was when you ordered it!
Get nutrients from whole foods, not pills & powders
A lot of people try to add missing nutrients to their diet in the form of a multivitamin or a large variety of supplements. Unfortunately, research has shown that despite containing the same chemical compounds, this is frequently not a substitute. The bioavailability (the amount that actually makes it into your bloodstream) is often much higher when you eat these micronutrients as part of whole foods, rather than taking them in a pill or as powders.
Protein powder is another issue. A lot of people will make protein shakes or add protein powder to foods like yogurt to get enough protein. Unfortunately, protein powders are missing a lot of the nutrients that protein-based whole foods contain. For protein shakes specifically, the below point applies regarding drinking your calories and its poor effect on satiety. If whole foods aren't an option, I'd recommend looking into protein bars with high fiber rather than a liquid option. RXBar is my favorite protein bar because of its simple ingredient list, high protein & fiber content (12g protein, 5g fiber) and good flavor, and it's well-priced at Costco at ~$1.25/bar vs $2 elsewhere. When I need a packable meal replacement that doesn't require refrigeration, I'll usually grab an RXBar, a Wholesome Medley trail mix (from Whole Foods), and an apple or pear.
Avoid drinking your calories - focus on low- or no-calorie beverages
Overall, drinking calories can confuse your body into consuming too many calories in a day. Your primary beverage should be water. That should be complemented primarily by low-calorie, unsweetened options like coffee or tea (potentially with milk and minimal sugar).
Drop the sugar-sweetened beverages, like soda
Soda and other sugar-sweetened beverages are not recognized by the body as consumed calories. When you drink 500 calories of soda, you're likely to increase your total daily consumption by 500 calories (gaining weight) instead of eating less food later. Not to mention, if you drink sugar-sweetened beverages frequently throughout the day, you're also destroying your teeth and potentially giving yourself diabetes.
Eat your meals instead of blending them into smoothies
Smoothies destroy much of the nutritional value in whole fruit, such as fibers and other complex molecules, because it's ground up into tiny bits by a blender. They also make it much easier to consume far more than you normally would. How much fruit goes into a single smoothie, compared to how many whole fruits you would eat in a single sitting?
Drop the sugary alcoholic drinks
Alcohol is another place to be careful. Cocktails are full of sugar from the simple syrup. Trying to save calories by getting a basic mixed drink with Diet Coke? Then you've got artificial sweeteners. Your best liquor-based option is probably a mix with soda water and lime - things like a vodka soda, ranch water, gin Rickey, or whiskey highball. High-alcohol beers have incredibly high calorie counts as well. There are some good options for low-calorie or non-alcoholic beer, which I covered in an earlier post.
For coffee, stick to the classics in the smallest size (4-8 oz)
Coffee-based drinks can be incredibly high-calorie, especially in the US. Mochas and blended/frozen drinks can be 500-1000 calories or more, for a single drink. This is especially harmful because of the American tendency to order the largest size instead of the smallest - it's a better deal, right? A Starbucks Java Chip Frappuccino is 560 calories for a venti (large). But this pales in comparison to Caribou Coffee, which offers drinks like the Turtle Mocha for 960 cal (L) / 1140 (XL) or the Caramel Caribou Cooler for 830 cal (L) / 1050 (XL). At Dunkin', you can get the Triple Mocha Frozen Coffee at 1100 cal (L) and the Caramel Creme Frozen Coffee at 1120 cal (L). So keep your eyes open on any specialty coffees.
When drinking coffee, go for the classics. If you don't like black coffee or espresso, then get a latte, cappuccino, flat white, cortado, or espresso macchiato. Out of those, lattes have the most milk (so the most calories), while espresso macchiatos have the least.
Also, order the smallest possible size - this is also the most authentic size, with a better ratio of espresso to milk. Starbucks carries a short size (8 oz) that isn't on their printed menu, but unfortunately many other chains only offer 12 oz as their smallest size. Third-wave coffee shops often have 8 oz or smaller sizes as well, especially for classics like a cappuccino or flat white.
One trick if you want to order a seasonal or flavored latte is that most coffee shops have a "1/2 sweet" option that uses half the syrup, which is usually more than sufficient to add flavor. I'll often order the smaller-sized cappuccino plus 1/2 the seasonal syrup instead of a latte, which gives me a similar experience in a smaller portion size and lower price.
Non-dairy milks at coffee shops are often full of unnecessary additives and over-sweetened, so try skim milk instead of almond/coconut milk if your goal is lower calories. Non-dairy milks are also full of empty carbs, whereas dairy milk has much more protein. For a richer drink, upgrade to whole milk and add a bit of sugar yourself if needed, instead of letting the barista pour in a huge amount of sugar-packed flavor syrup.
Give tea a try
Another great zero-calorie option is tea. Experiment with different teas, whether it's black, green, white, masala chai, or an herbal non-caffeinated tea. The only calories come from any milk or sugar you add, but try appreciating the flavor of the tea alone. If you don't like it, maybe you want to upgrade to higher-quality teas. I particularly like the herbal options from Celestial Seasonings and Twinings. Tazo, Rishi, and Stash come well-recommended as tea brands you can find at many places in the US. If you really get serious, you'll probably upgrade to loose-leaf tea from a local shop.
Overall, minimize the calories in your drinks. Water, coffee (but not mochas / frozen drinks), and tea are great options, while you should minimize smoothies, soda, and alcohol.
But what do meals actually look like?
That seems like a ton of restrictions and rules, right? How can you, or I, keep track of them all? Overall, it's about eating a wide variety and healthy ratio of colorful, minimally processed whole foods, with natural flavor and sweetness, only during meals.
Here are some examples for a day at 1500 calories (a 1000-calorie deficit):
Breakfast
I eat the same thing almost every day, aiming for a savory breakfast rather than a sweet one. The only things I change are additions to the eggs. The veggies vary, and sometimes I swap the sriracha for salsa or kimchi.
- 2 scrambled pasture-raised eggs, with 2 diced mushrooms, 1/3 diced heirloom tomato, low-sodium lentils / black beans, and sriracha
- 250g (~9 oz) Costco three-berry blend of blackberries/raspberries/blueberries (microwaved), combined with 90g (~3.5 oz) whole-milk skyr and 15g (~1 tbsp) chia seeds
- 110g (~4 oz) kefir (fermented milk)
Lunch
Every day for lunch, I'll have a side salad, a veggie plate, or an entree salad with lean protein in it. I aim for flavorful veggies that don't require any sort of dip - try your local farmer's market for better-tasting veggies than the grocery store carries.
- Big salad with Costco power greens (kale, spinach, baby chard), dressed with 1 tsp extra-virgin olive oil, vinegar, salt, and pepper
- 115g (~4 oz) pulled chicken with mustard-based BBQ sauce
- 200g (~7 oz) Stokes/Okinawan sweet potato with 1 tsp grass-fed butter
Usually my protein is chicken, canned tuna, canned salmon, or frozen, pre-cooked shrimp. On other days, I might make a chicken-salad open-faced sandwich, or the same with tuna salad. Sometimes I'll put smoked salmon on Wasa crackers, or I'll add salmon or shrimp to my salad. I'll also regularly have tacos with chicken/shrimp, corn tortillas, veggies, salsa, and skyr (instead of sour cream).
Dinner
This day, I ate out at a burger restaurant. Here's the healthier option I constructed, using Mediterranean ratios and my other guidelines:
- Crispy Brussels sprouts for a starter
- Bison burger (6 oz / 180g patty), no bun, on a bed of lettuce and tomato
- Topped with ~1 tbsp fig jam and ~15g blue cheese (I scraped off half of the blue cheese and got the fig jam on the side, so I could control the portion)
- Side of steamed broccoli with butter
Maintaining weight is just as hard as losing it
One of my biggest challenges has been making this transition into a sustainable diet, after depriving myself of many foods I enjoy for the past year. In particular, it's extremely hard to avoid eating too many desserts or snack foods with added sugar, especially when I'm toward the lower end of my target weight range.
I speculate that this is partially related to "set point" theory. My body's used to being much heavier, and it will take time for my body to realize that I'm healthy at this new level rather than trying to survive a famine, where I should try hard to eat high-calorie foods whenever I come across them. Exercise also helps in maintaining weight loss (there's a study done on police officers who continued exercise post-weight-loss vs those who didn't, and a variety of examples from the National Weight Control Registry).
Fitness is a lifelong journey
On the fitness side, I've taken an even more efficiency-optimized approach than I had before, with continued success.
I found my energy levels getting extremely low as I approached my target weight while maintaining a large calorie deficit. This prompted me to experiment with whether I could decrease my frequency and intensity of exercise, while still getting most of the results.
Dropping HIIT with no hit in results
I kept my daily walks for low-intensity steady state (LISS) cardio exercise, although I've adapted them slightly into 3 per day - with a 15-minute walk after each of my 3 meals. However, I experimented with dropping the high-intensity interval training (HIIT).
Surprisingly, my VO2Max (a measure of cardiorespiratory health) continued to increase at almost the same rate as before. My plan is to watch for a plateau in VO2Max, and consider re-introducing HIIT at that point. Alternately, if I ever get too short on time to continue with enough LISS, I could replace it entirely with my extremely low-volume HIIT program.
I would like to re-add HIIT at some point because a mixture of different intensities is overall better than just one. However I frankly don't enjoy HIIT so I'm not in a big rush, until I have a clear need (like I mentioned above).
Simplifying and reducing strength training
I was also doing strength training 3x/week, with 3 paired sets per workout. I've replaced that with a 2x/week pattern, also dropping from 3 paired sets to only 1 - importantly, performed to near-failure. Again, I've seen nearly equivalent results. Reviewing the academic research and expert recommendations in this area, I found that many experts see sets 2 and 3 as essentially "insurance" that you've maximized your potential growth in strength & size during a workout. At worst, doing a single set might offer more than 50% of the total benefit of any number of sets. That means a single set - if done well - could provide a majority of the benefits in just 1/3 of the time. This fits nicely into my 80/20 philosophy.
If you'd like to look into this in more detail, go on Google Scholar and look up "resistance training single-set OR one-set review OR meta-analysis." In general, the research shows a dose-dependent response (more sets produce better results), but there's diminishing returns from each additional set. You need to carefully look at the effect sizes, comparing the effect size of one set to the effect size of multiple sets. There will often be statistically significant differences, but the effect size is the important part. It's not about whether the difference is real, it's about how big it is. Overall, if you're optimizing for efficiency on time spent working out rather than maximizing muscle growth in a certain period of time (e.g. a year), single sets can be a great approach. My perspective is that I'll be doing this for the rest of my life, and I'll be moving increasingly slowly toward a plateau of my biological maximum strength, so I don't really care how many years it takes.
I may find that I need to increase my set count as I get more experience with strength training, and my "newbie gains" gradually fade away. We'll see how things continue to develop over time, and whether I hit a plateau where that might be an option I try.
My current strength training continues to follow a similar routine to the one described in my last writeup. I use the 8×3 app to track my progressions & progressive overload, and I alternate between two routines, both of which are full-body workouts with compound movements:
Day 1: Vertical push/pull (+core & legs). L-sit pull-ups, dips / handstand push-ups, squats, Nordic curls.
Day 2: Horizontal push/pull (+core & legs). Horizontal rows, push-ups, squats, Nordic curls, hanging leg raises.
Each exercise is part of a progression toward more advanced, lower-leverage movements that will continue to build strength, without the need to use any weights. For example, I'm specifically working on pistol & shrimp squats, handstand push-up negatives, pseudo planche push-ups, L-sit pull-ups, and tucked front levers.
I've added two more low-cost, small, and portable pieces of equipment to make this easy, bringing my total to three pieces. I'd already purchased a doorway pull-up bar ($26). Since then, I've added gymnastics rings ($32) hanging from the pull-up bar. Rings are extremely flexible - I use them for horizontal rows and dips, but they can be used for ab roll-outs, pull-ups (instead of the bar), and so much more. I'm also using a Nordstick ($27, or a bit more for the Pro) that slides under a closet door, because Nordic curls are tricky without some sort of specialized device. An alternative, equipment-free exercise is a reverse hyperextension, but the unweighted version will plateau pretty quickly.
Overall, I've further reduced the time commitment from exercise without significant impact. I've removed HIIT, maintained LISS (daily, 15 min x 3), and reduced strength training (2x/wk, 10 min x 1), and I still see nearly equivalent outcomes.
I'm not just maintaining my fitness and strength - they're continuing to grow, even without any caloric surplus. I do expect that recomposition to plateau within a year or two on a maintenance diet. At that point, I may need to do mini bulks and cuts (gaining/losing weight in cycles to grow my muscle mass).
Learn more
Want to learn more? Here are some books that I've found helpful, roughly in order. I've also shared my Kindle highlights for each one, in case you want to see my perspective on the key points before reading the full book.
- Ultra-Processed People (my Kindle highlights)
- Metabolical: The Lure and the Lies of Processed Food, Nutrition, and Modern Medicine (my Kindle highlights)
- Glucose Revolution: The Life-Changing Power of Balancing Your Blood Sugar (my Kindle highlights)
- Salt Sugar Fat: How the Food Giants Hooked Us (my Kindle highlights)
- Sugarless: A 7-Step Plan to Uncover Hidden Sugars, Curb Your Cravings, and Conquer Your Addiction (my Kindle highlights)
- The No S Diet: The Strikingly Simple Weight-Loss Strategy That Has Dieters Raving-and Dropping Pounds (my Kindle highlights)
- Ravenous: How to get ourselves and our planet into shape (my Kindle highlights)
- The Way We Eat Now: How the Food Revolution Has Transformed Our Lives, Our Bodies, and Our World (my Kindle highlights)
- Spoon-Fed: Why Almost Everything We've Been Told About Food is Wrong (my Kindle highlights)
- Food for Life: The New Science of Eating Well (my Kindle highlights)
- The Dorito Effect (my Kindle highlights)
- The End of Craving: Recovering the Lost Wisdom of Eating Well (my Kindle highlights)
- Lose It Forever: The 6 Habits of Successful Weight Losers from the National Weight Control Registry (my Kindle highlights)
17 Dec 2024 9:05pm GMT
16 Dec 2024
planet.freedesktop.org
Lennart Poettering: Announcing systemd v257
Last week we released systemd v257 into the wild.
In the weeks leading up to this release (and the week after) I have posted a series of serieses of posts to Mastodon about key new features in this release, under the #systemd257 hash tag. In case you aren't using Mastodon, but would like to read up, here's a list of all 37 posts:
- Post #1: Fully Locked Accounts with systemd-sysusers
- Post #2: Combined Signed PCR and Locally Managed PCR Policies for Disk Encryption
- Post #3: Progress Indication via Terminal ANSI Sequence
- Post #4: Multi-Profile UKIs
- Post #5: The New sd-varlink & sd-json APIs in libsystemd
- Post #6: Querying for Passwords in User Scope
- Post #7: Secure Attention Key Logic in systemd-logind
- Post #8: systemd-nspawn --bind-user= Now Copies User's SSH Key
- Post #9: The New DeferReactivation= Switch in .timer Units
- Post #10: Support for the New IPE LSM
- Post #11: Environment Variables for Shell Prompt Prefix/Suffix
- Post #12: sysctl Conflict Detection via eBPF
- Post #13: initrd and µcode UKI Add-Ons
- Post #14: SecureBoot Signing with the New systemd-sbsign Tool
- Post #15: Managed Access to hidraw devices in systemd-logind
- Post #16: Fuzzy Filtering in userdbctl
- Post #17: MAC Address Based Alternative Network Interface Names
- Post #18: Conditional Copying/Symlinking in tmpfiles.d/
- Post #19: Automatic Service Restarts in Debug Mode
- Post #20: Filtering by Invocation ID in journalctl
- Post #21: Supplement Partitions in repart.d/
- Post #22: DeviceTree Matching in UKIs
- Post #23: The New ssh-exec: Protocol in varlinkctl
- Post #24: SecureBoot Key Enrollment Preparation with bootctl
- Post #25: Automatically Installing confext/sysext/portable/VMs/container Images at Boot
- Post #26: Designated Maintenance Time in systemd-logind
- Post #27: PID Namespacing in Service Management
- Post #28: Marking Experimental OS Releases in /etc/os-release
- Post #29: Decoding Capability Masks with systemd-analyze
- Post #30: Investigating Passed SMBIOS Type #11 Data
- Post #31: Initializing Partitions from Character Devices in repart.d/
- Post #32: Entering Namespaces to Generate Stacktraces
- Post #33: ID Mapped Mounts for Per-Service Directories
- Post #34: A Daemon for systemd-sysupdate
- Post #35: User Record Modifications without Administrator Consent in systemd-homed
- Post #36: DNR DHCP Support
- Post #37: Name Based AF_VSOCK ssh Access
I intend to do a similar series of serieses of posts for the next systemd release (v258), hence if you haven't left tech Twitter for Mastodon yet, now is the opportunity.
16 Dec 2024 11:00pm GMT
14 Dec 2024
planet.freedesktop.org
Simon Ser: Status update, December 2024
Hi!
For once let's open things up with the NPotM. I've started working on sajin, an Android app which synchronizes camera pictures in the background. I've grown tired of manually copying files around, and I don't want to use proprietary services to backup my pictures, so I've been meaning to write a tiny app to upload pictures to my server. It's super simple: enter the WebDAV server URL and credentials, then just forget about the app. It plays well with sogogi (my WebDAV file server) and Photoview (a Web picture gallery). I'd like to implement feedback on synchronization status and manual synchronization of older pictures. I really need to find an icon for it too.
Once again, this month I've spent a fair bit of time on Sway and wlroots bug fixes, in particular wlroots DRM backend issues affecting old GPUs (those not supporting the atomic KMS API) and multi-GPU setups (I've had to bite the bullet and bring my super shaky setup out of the closet). wlroots 0.18.2 has been released; among other things, it fixes some X11 drag-and-drop bugs (thanks Consolatis!).
In IRC land, delthas has added soju support for the metadata extension, enabling clients to mark conversations as pinned or muted. Once senpai and Goguma add support for this extension, they will be able to synchronize this bit of state. In other words, marking a conversation as pinned on a mobile phone will also affect all other connected clients.
Thanks to John Regan, PostgreSQL message queries have been optimized by several orders of magnitude: on large message stores, they now take a few milliseconds instead of multiple seconds. I've turned on WAL mode for SQLite, which should help with message insertion performance.
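For the curious, WAL mode is enabled with a single pragma. The snippet below shows it from Python's sqlite3 module purely as an illustration - soju itself is written in Go, and the database path here is a placeholder:

```python
# Illustration only: enabling SQLite's write-ahead log (WAL) journal mode.
import sqlite3

conn = sqlite3.connect("soju.db")  # placeholder path
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # prints "wal" once active; the mode persists for the database file
conn.close()
```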
I've worked on making Goguma play better with direct connections to old IRC servers such as Libera Chat and OFTC. These servers support only a few IRCv3 extensions, and they aggressively rate-limit TCP connections and commands (including the CAP REQ commands sent to initialize the connection). Goguma should now reconnect less often on first setup and should connect more quickly (by reducing the number of CAP REQ commands it sends).
Last, I've added proper support for GitLab Pages to dalligi, a small bridge to use builds.sr.ht as a GitLab CI runner. GitLab Pages requires defining a special job with the exact name "pages", which is cumbersome with builds.sr.ht. dalligi can now copy over artifacts of a previous job to this special "pages" job. I hope this can be used to automatically publish wlroots docs.
See you next year!
14 Dec 2024 10:00pm GMT
12 Dec 2024
planet.freedesktop.org
Hans de Goede: IPU6 camera support is broken in kernel 6.11.11 / 6.12.2-6.12.4
Unfortunately an incomplete backport of IPU6 DMA handling changes has landed in kernel 6.11.11.
Not only does this cause IPU6 cameras to not work, it also causes the kernel to (often?) crash on boot on systems where the IPU6 is in use and thus enabled by the BIOS.
Kernels 6.12.2 - 6.12.4 are also affected by this. A fix for this is pending for the upcoming 6.12.5 release.
6.11.11 is the last stable release in the 6.11.y series, so there will be no new stable 6.11.y release with a fix.
As a workaround, users affected by this can stay with 6.11.10 or 6.12.1 until 6.12.5 is available in their distribution's updates(-testing) repository.
12 Dec 2024 1:52pm GMT
02 Dec 2024
planet.freedesktop.org
Alyssa Rosenzweig: Vulkan 1.4 sur Asahi Linux
English version follows.
Aujourd'hui, Khronos Group a sorti la spécification 1.4 de l'API graphique standard Vulkan. Le projet Asahi Linux est fier d'annoncer le premier pilote Vulkan 1.4 pour le matériel d'Apple. En effet, notre pilote graphique Honeykrisp est reconnu par Khronos comme conforme à cette nouvelle version dès aujourd'hui.
Ce pilote est déjà disponible dans nos dépôts officiels. Après avoir installé Fedora Asahi Remix, exécutez dnf upgrade --refresh pour obtenir la dernière version du pilote.
Vulkan 1.4 standardise plusieurs fonctionnalités importantes, y compris les horodatages et la lecture locale avec le rendu dynamique. L'industrie suppose que ces fonctionnalités devront être plus courantes, et nous y sommes préparés.
Sortir un pilote conforme reflète notre engagement en faveur des standards graphiques et du logiciel libre. Asahi Linux est aussi compatible avec OpenGL 4.6, OpenGL ES 3.2, et OpenCL 3.0, tous conformes aux spécifications pertinentes. D'ailleurs, les nôtres sont les seuls pilotes conformes pour le materiel d'Apple de n'importe quel standard graphique.
Même si le pilote est sorti, il faut encore compiler une version expérimentale de Vulkan-Loader pour utiliser la nouvelle version de Vulkan. Toutes les nouvelles fonctionnalités sont néanmoins disponibles comme extensions à notre pilote Vulkan 1.3 pour en profiter tout de suite.
Pour plus d'informations, consultez l'article du blog de Khronos.
Today, the Khronos Group released the 1.4 specification of Vulkan, the standard graphics API. The Asahi Linux project is proud to announce the first Vulkan 1.4 driver for Apple hardware. Our Honeykrisp driver is Khronos-recognized as conformant to the new version since day one.
That driver is already available in our official repositories. After installing Fedora Asahi Remix, run dnf upgrade --refresh to get the latest drivers.
Vulkan 1.4 standardizes several important features, including timestamps and dynamic rendering local read. The industry expects that these features will become more common, and we are prepared.
Releasing a conformant driver reflects our commitment to graphics standards and software freedom. Asahi Linux is also compatible with OpenGL 4.6, OpenGL ES 3.2, and OpenCL 3.0, all conformant to the relevant specifications. For that matter, ours are the only conformant drivers on Apple hardware for any graphics standard.
Although the driver is released, you still need to build an experimental version of Vulkan-Loader to access the new Vulkan version. Nevertheless, you can immediately use all the new features as extensions in our Vulkan 1.3 driver.
For more information, see the Khronos blog post.
02 Dec 2024 5:00am GMT
21 Nov 2024
planet.freedesktop.org
Simon Ser: Status update, November 2024
Hi all!
This month I've spent a lot of time triaging Sway and wlroots issues following the Sway 1.10 release. There are a few regressions, some of which are already fixed (thanks to all contributors for sending patches!). Kenny has added support for software-only secondary KMS devices such as GUD and DisplayLink. David Turner from Raspberry Pi has contributed crop and scale support for output buffers, so that video players are more likely to hit direct scan-out. I've added support for explicit sync in the Wayland backend for nested compositors.
I've worked a bit on the Goguma mobile IRC client. The auto-complete dropdown now shows user display names, channel topics and command descriptions. Additionally, commands which don't make sense given the current context are hidden (for instance, /part is not displayed in a conversation with a single user).
The gamja Web IRC client should now reconnect more quickly after regaining connectivity. For instance, after resume from suspend, gamja now reconnects immediately instead of waiting 10 seconds. Thanks to Matteo, soju-containers now ships arm64 images.
The NPotM is sogogi, a simple WebDAV file server. It's quite minimal for now: a list of directories to serve is defined in the configuration file, as well as users and access lists. In the future, I'd like to add external authentication (e.g. via PAM or via another HTTP server), HTML directory listings and configuration file reload.
That's all for now! Once again, that's a pretty short status update. A lot of my time goes into more boring maintenance tasks and reviews. See you next month!
21 Nov 2024 10:00pm GMT
19 Nov 2024
planet.freedesktop.org
Melissa Wen: Display/KMS Meeting at XDC 2024: Detailed Report
XDC 2024 in Montreal was another fantastic gathering for the Linux Graphics community. It was again a great time to immerse in the world of graphics development, engage in stimulating conversations, and learn from inspiring developers.
Many Igalia colleagues and I participated in the conference again, delivering multiple talks about our work on the Linux Graphics stack and also organizing the Display/KMS meeting. This blog post is a detailed report on the Display/KMS meeting held during this XDC edition.
Short on Time?
- Catch the lightning talk summarizing the meeting here (you can even speed up 2x):
- For a quick written summary, scroll down to the TL;DR section.
TL;DR
This meeting took 3 hours and tackled a variety of topics related to DRM/KMS (Linux/DRM Kernel Modesetting):
- Sharing Drivers Between V4L2 and KMS: Brainstorming solutions for using a single driver for devices used in both camera capture and display pipelines.
- Real-Time Scheduling: Addressing issues with non-blocking page flips encountering sigkills under real-time scheduling.
- HDR/Color Management: Agreement on merging the current proposal, with NVIDIA implementing its special cases on VKMS and adding missing parts on top of Harry Wentland's (AMD) changes.
- Display Mux: Collaborative design discussions focusing on compositor control and cross-sync considerations.
- Better Commit Failure Feedback: Exploring ways to equip compositors with more detailed information for failure analysis.
Bringing together Linux display developers at XDC 2024
While I didn't present a talk this year, I co-organized a Display/KMS meeting (with Rodrigo Siqueira of AMD) to build upon the momentum from the 2024 Linux Display Next hackfest. The meeting was attended by around 30 people in person and 4 remote participants.
Speakers: Melissa Wen (Igalia) and Rodrigo Siqueira (AMD)
Link: https://indico.freedesktop.org/event/6/contributions/383/
Topics: Similar to the hackfest, the meeting agenda was built over the first two days of the conference, mixing talk follow-ups with new ideas and ongoing community efforts.
The final agenda covered five topics in the scheduled order:
- How to share drivers between V4L2 and DRM for bridge-like components (new topic);
- Real-time Scheduling (problems encountered after the Display Next hackfest);
- HDR/Color Management (ofc);
- Display Mux (from Display hackfest and XDC 2024 talk, bringing AMD and NVIDIA together);
- (Better) Commit Failure Feedback (continuing the last minute topic of the Display Next hackfest).
Unpacking the Topics
Similar to the hackfest, the meeting agenda evolved over the conference. During the 3 hours of meeting, I coordinated the room and discussion rounds, and Rodrigo Siqueira took notes and also contacted key developers to provide a detailed report of the many topics discussed.
From his notes, let's dive into the key discussions!
How to share drivers between V4L2 and KMS for bridge-like components.
Led by Laurent Pinchart, we delved into the challenge of creating a unified driver for hardware devices (like scalers) that are used in both camera capture pipelines and display pipelines.
- Problem Statement: How can we design a single kernel driver to handle devices that serve dual purposes in both V4L2 and DRM subsystems?
- Potential Solutions:
- Multiple Compatible Strings: We could assign different compatible strings to the device tree node based on its usage in either the camera or display pipeline. However, this approach might raise concerns from device tree maintainers as it could be seen as a layer violation.
- Separate Abstractions: A single driver could expose the device to both DRM and V4L2 through separate abstractions: drm-bridge for DRM and V4L2 subdev for video. While simple, this approach requires maintaining two different abstractions for the same underlying device.
- Unified Kernel Abstraction: We could create a new, unified kernel abstraction that combines the best aspects of drm-bridge and V4L2 subdev. This approach offers a more elegant solution but requires significant design effort and potential migration challenges for existing hardware.
Real-Time Scheduling Challenges
We had discussed real-time scheduling during this year's Linux Display Next hackfest and, during XDC 2024, Jonas Adahl brought up issues uncovered while progressing on this front.
- Context: Non-blocking page-flips can, on rare occasions, take a long time and, for that reason, receive a SIGKILL if the thread doing the atomic commit runs under real-time scheduling.
- Action items:
- Explore alternative backtraces during the busy wait (e.g., ftrace).
- Investigate the maximum thread time in busy wait to reproduce issues faced by compositors. Tools like RTKit (mutter) can be used for better control (Michel Dänzer can help with this setup).
HDR/Color Management
This is a well-known topic with ongoing effort on all layers of the Linux Display stack and has been discussed online and in-person in conferences and meetings over the last years.
Here's a breakdown of the key points raised at this meeting:
- Talk: Color operations for Linux color pipeline on AMD devices: On the previous day, Alex Hung (AMD) had presented the implementation of this API in the AMD display driver.
- NVIDIA Integration: While they agree with the overall proposal, NVIDIA needs to add some missing parts. Importantly, they will implement these on top of Harry Wentland's (AMD) proposal. Their specific requirements will be implemented on VKMS (Virtual Kernel Mode Setting driver) for further discussion. This VKMS implementation can benefit compositor developers by providing insights into NVIDIA's specific needs.
- Other vendors: There is a version of the KMS API applied to the Intel color pipeline. Apart from that, other vendors appear to be comfortable with the current proposal but lack the bandwidth to implement it right now.
- Upstream Patches: The relevant upstream patches can be found here. [As was humorously noted, this series is eagerly awaiting your "Acked-by" (approval).]
- Compositor Side: The compositor developers have also made significant progress.
- KDE has already implemented and validated the API through an experimental implementation in Kwin.
- Gamescope currently uses a driver-specific implementation but has a draft that utilizes the generic version. However, some work is still required to fully transition away from the driver-specific approach. AP: work on porting gamescope to KMS generic API
- Weston has also begun exploring implementation, and we might see something from them by the end of the year.
- Kernel and Testing: The kernel API proposal is well-refined and meets the DRM subsystem requirements. Thanks to Harry Wentland's efforts, we already have the API attached to two hardware vendors and IGT tests, and, thanks to Xaver Hugl, a compositor implementation in place.
Finally, there was a strong sense of agreement that the current proposal for HDR/Color Management is ready to be merged. In simpler terms, everything seems to be working well on the technical side - all signs point to merging and "shipping" the DRM/KMS plane color management API!
Display Mux
During the meeting, Daniel Dadap led a brainstorming session on the design of the display mux switching sequence, in which the compositor would arm the switch via sysfs, then send a modeset to the outgoing driver, followed by a modeset to the incoming driver.
- Context:
- During this year's Linux Display Next hackfest, Mario Limonciello (AMD) introduced the topic and led a discussion on Display Mux.
- Daniel Dadap (NVIDIA) retook this discussion with the XDC 2024 talk: Dynamic Switching of Display Muxes on Hybrid GPU Systems.
- Key Considerations:
- HPD Handling: There was a general consensus that disabling HPD can be part of the sequence for internal panels and we don't need to focus on it here.
- Cross-Sync: Ensuring synchronization between the compositor and the drivers is crucial. The compositor should act as the "drm-master" to coordinate the entire sequence, but how can this be ensured?
- Future-Proofing: The design should not assume the presence of a mux. In future scenarios, direct sharing over DP might be possible.
- Action points:
- Sharing DP AUX: Explore the idea of sharing DP AUX and its implications.
- Backlight: The backlight definition represents a problem in the mux switch context, so we should explore some of the current specs available for that.
Towards Better Commit Failure Feedback
In the last part of the meeting, Xaver Hugl asked for better commit failure feedback.
- Problem description: Compositors currently face challenges in collecting detailed information from the kernel about commit failures. This lack of granular data hinders their ability to understand and address the root causes of these failures.
To address this issue, we discussed several potential improvements:
- Direct Kernel Log Access: One idea is to directly load relevant kernel logs into the compositor. This would provide more detailed information about the failure and potentially aid in debugging.
- Finer-Grained Failure Reporting: We also explored the possibility of separating atomic failures into more specific categories. Not all failures are critical, and understanding the nature of the failure can help compositors take appropriate action.
- Enhanced Logging: Currently, the dmesg log doesn't provide enough information for user-space validation. Raising the log level to capture more detailed information during failures could be a viable solution.
By implementing these improvements, we aim to equip compositors with the necessary tools to better understand and resolve commit failures, leading to a more robust and stable display system.
A Big Thank You!
Huge thanks to Rodrigo Siqueira for these detailed meeting notes. Thanks also to Laurent Pinchart, Jonas Adahl, Daniel Dadap, Xaver Hugl, and Harry Wentland for bringing up interesting topics and leading discussions. Finally, thanks to all the participants who enriched the discussions with their experience, ideas, and inputs, especially Alex Goins, Antonino Maniscalco, Austin Shafer, Daniel Stone, Demi Obenour, Jessica Zhang, Joan Torres, Leo Li, Liviu Dudau, Mario Limonciello, Michel Dänzer, Rob Clark, Simon Ser and Teddy Li.
This collaborative effort will undoubtedly contribute to the continued development of the Linux display stack.
Stay tuned for future updates!
19 Nov 2024 1:00pm GMT
Peter Hutterer: hidreport and hut: two crates for handling HID Report Descriptors and HID Reports
A while ago I was looking at Rust-based parsing of HID reports but, surprisingly, outside of C wrappers and the usual cratesquatting I couldn't find anything ready to use. So I figured, why not write my own, NIH style. Yay! Gave me a good excuse to learn API design for Rust and whatnot. Anyway, the result of this effort is the hidutils collection of repositories which includes commandline tools like hid-recorder and hid-replay but, more importantly, the hidreport (documentation) and hut (documentation) crates. Let's have a look at the latter two.
Both crates were intentionally written with minimal dependencies; they currently only depend on thiserror, and arguably even that dependency could be removed.
HID Usage Tables (HUT)
As you know, HID Fields have a so-called "Usage" which is divided into a Usage Page (like a chapter) and a Usage ID. The HID Usage tells us what a sequence of bits in a HID Report represents, e.g. "this is the X axis" or "this is button number 5". These usages are specified in the HID Usage Tables (HUT) (currently at version 1.5 (PDF)). The hut crate is generated from the official HUT json file and contains all current HID Usages together with the various conversions you will need to get from a numeric value in a report descriptor to the named usage and vice versa. Which means you can do things like this:
let gd_x = GenericDesktop::X;
let usage_page = gd_x.usage_page();
assert!(matches!(usage_page, UsagePage::GenericDesktop));
Or the more likely need: convert from a numeric page/id tuple to a named usage.
let usage = Usage::new_from_page_and_id(0x1, 0x30); // GenericDesktop / X
println!("Usage is {}", usage.name());
90% of this crate is the various conversions from a named usage to the numeric value and vice versa. It's a huge crate in that there are lots of enum values, but the actual functionality is relatively simple.
hidreport - Report Descriptor parsing
The hidreport crate is the one that can take a set of HID Report Descriptor bytes obtained from a device and parse the contents. Or extract the value of a HID Field from a HID Report, given the HID Report Descriptor. So let's assume we have a bunch of bytes that are a HID report descriptor read from the device (or sysfs), then we can do this:
let rdesc: ReportDescriptor = ReportDescriptor::try_from(bytes).unwrap();
I'm not going to copy/paste the code to run through this report descriptor but suffice to say it will give us access to the input, output and feature reports on the device together with every field inside those reports. Now let's read from the device and parse the data for whatever the first field is in the report (this is obviously device-specific; it could be a button, a coordinate, anything):
let input_report_bytes = read_from_device();
let report = rdesc.find_input_report(&input_report_bytes).unwrap();
let field = report.fields().first().unwrap();
match field {
    Field::Variable(var) => {
        let val: u32 = var.extract(&input_report_bytes).unwrap().into();
        println!("Field {:?} is of value {}", field, val);
    },
    _ => {}
}
The full documentation is of course on docs.rs and I'd be happy to take suggestions on how to improve the API and/or add features not currently present.
hid-recorder
The hidreport and hut crates are still quite new but we have an existing test bed that we use regularly. The venerable hid-recorder tool has been rewritten twice already. Benjamin Tissoires' first version was in C, then a Python version of it became part of hid-tools, and now we have the third version written in Rust, which has a few nice features over the Python version and which we're using heavily for e.g. udev-hid-bpf debugging and development. An example output of that is below and it shows that you can get all the information out of the device via the hidreport and hut crates.
$ sudo hid-recorder /dev/hidraw1
# Microsoft Microsoft® 2.4GHz Transceiver v9.0
# Report descriptor length: 223 bytes
# 0x05, 0x01, // Usage Page (Generic Desktop) 0
# 0x09, 0x02, // Usage (Mouse) 2
# 0xa1, 0x01, // Collection (Application) 4
# 0x05, 0x01, // Usage Page (Generic Desktop) 6
# 0x09, 0x02, // Usage (Mouse) 8
# 0xa1, 0x02, // Collection (Logical) 10
# 0x85, 0x1a, // Report ID (26) 12
# 0x09, 0x01, // Usage (Pointer) 14
# 0xa1, 0x00, // Collection (Physical) 16
# 0x05, 0x09, // Usage Page (Button) 18
# 0x19, 0x01, // UsageMinimum (1) 20
# 0x29, 0x05, // UsageMaximum (5) 22
# 0x95, 0x05, // Report Count (5) 24
# 0x75, 0x01, // Report Size (1) 26
... omitted for brevity
# 0x75, 0x01, // Report Size (1) 213
# 0xb1, 0x02, // Feature (Data,Var,Abs) 215
# 0x75, 0x03, // Report Size (3) 217
# 0xb1, 0x01, // Feature (Cnst,Arr,Abs) 219
# 0xc0, // End Collection 221
# 0xc0, // End Collection 222
R: 223 05 01 09 02 a1 01 05 01 09 02 a1 02 85 1a 09 ... omitted for brevity
N: Microsoft Microsoft® 2.4GHz Transceiver v9.0
I: 3 45e 7a5
# Report descriptor:
# ------- Input Report -------
# Report ID: 26
# Report size: 80 bits
# | Bit: 8 | Usage: 0009/0001: Button / Button 1 | Logical Range: 0..=1 |
# | Bit: 9 | Usage: 0009/0002: Button / Button 2 | Logical Range: 0..=1 |
# | Bit: 10 | Usage: 0009/0003: Button / Button 3 | Logical Range: 0..=1 |
# | Bit: 11 | Usage: 0009/0004: Button / Button 4 | Logical Range: 0..=1 |
# | Bit: 12 | Usage: 0009/0005: Button / Button 5 | Logical Range: 0..=1 |
# | Bits: 13..=15 | ######### Padding |
# | Bits: 16..=31 | Usage: 0001/0030: Generic Desktop / X | Logical Range: -32767..=32767 |
# | Bits: 32..=47 | Usage: 0001/0031: Generic Desktop / Y | Logical Range: -32767..=32767 |
# | Bits: 48..=63 | Usage: 0001/0038: Generic Desktop / Wheel | Logical Range: -32767..=32767 | Physical Range: 0..=0 |
# | Bits: 64..=79 | Usage: 000c/0238: Consumer / AC Pan | Logical Range: -32767..=32767 | Physical Range: 0..=0 |
# ------- Input Report -------
# Report ID: 31
# Report size: 24 bits
# | Bits: 8..=23 | Usage: 000c/0238: Consumer / AC Pan | Logical Range: -32767..=32767 | Physical Range: 0..=0 |
# ------- Feature Report -------
# Report ID: 18
# Report size: 16 bits
# | Bits: 8..=9 | Usage: 0001/0048: Generic Desktop / Resolution Multiplier | Logical Range: 0..=1 | Physical Range: 1..=12 |
# | Bits: 10..=11 | Usage: 0001/0048: Generic Desktop / Resolution Multiplier | Logical Range: 0..=1 | Physical Range: 1..=12 |
# | Bits: 12..=15 | ######### Padding |
# ------- Feature Report -------
# Report ID: 23
# Report size: 16 bits
# | Bits: 8..=9 | Usage: ff00/ff06: Vendor Defined Page 0xFF00 / Vendor Usage 0xff06 | Logical Range: 0..=1 | Physical Range: 1..=12 |
# | Bits: 10..=11 | Usage: ff00/ff0f: Vendor Defined Page 0xFF00 / Vendor Usage 0xff0f | Logical Range: 0..=1 | Physical Range: 1..=12 |
# | Bit: 12 | Usage: ff00/ff04: Vendor Defined Page 0xFF00 / Vendor Usage 0xff04 | Logical Range: 0..=1 | Physical Range: 0..=0 |
# | Bits: 13..=15 | ######### Padding |
##############################################################################
# Recorded events below in format:
# E: . [bytes ...]
#
# Current time: 11:31:20
# Report ID: 26 /
# Button 1: 0 | Button 2: 0 | Button 3: 0 | Button 4: 0 | Button 5: 0 | X: 5 | Y: 0 |
# Wheel: 0 |
# AC Pan: 0 |
E: 000000.000124 10 1a 00 05 00 00 00 00 00 00 00
19 Nov 2024 1:54am GMT
18 Nov 2024
planet.freedesktop.org
Ricardo Garcia: My XDC 2024 talk about VK_EXT_device_generated_commands
Some days ago I wrote about the new VK_EXT_device_generated_commands Vulkan extension that had just been made public. Soon after that, I presented a talk at XDC 2024 with a brief introduction to it. It's a lightning talk that lasts just about 7 minutes and you can find the embedded video below, as well as the slides and the talk transcription if you prefer written formats.
Truth be told, the topic deserves a longer presentation, for sure. However, when I submitted my talk proposal for XDC I wasn't sure if the extension was going to be public by the time XDC would take place. This meant I had two options: if I submitted a half-slot talk and the extension was not public, I needed to talk for 15 minutes about some general concepts and a couple of NVIDIA vendor-specific extensions: VK_NV_device_generated_commands and VK_NV_device_generated_commands_compute. That would be awkward, so I went with a lightning talk where I could talk about those general concepts and, maybe, talk about some VK_EXT_device_generated_commands specifics if the extension was public, which is exactly what happened.
Fortunately, I will talk again about the extension at Vulkanised 2025. It will be a longer talk and I will cover the topic in more depth. See you in Cambridge in February and, for those not attending, stay tuned because Vulkanised talks are recorded and later uploaded to YouTube. I'll post the link here and in social media once it's available.
XDC 2024 recording
Talk slides and transcription
Hello, I'm Ricardo from Igalia and I'm going to talk about Device-Generated Commands in Vulkan. This is a new extension that was released a couple of weeks ago. I wrote CTS tests for it, I helped with the spec and I worked with some actual heroes, some of them present in this room, who managed to get this implemented in a driver.
Device-Generated Commands is an extension that allows apps to go one step further in GPU-driven rendering because it makes it possible to write commands to a storage buffer from the GPU and later execute the contents of the buffer without needing to go through the CPU to record those commands, like you typically do by calling vkCmd functions working with regular command buffers.
It's one step ahead of indirect draws and dispatches, and one step behind work graphs.
Getting away from Vulkan momentarily, if you want to store commands in a storage buffer there are many possible ways to do it. A naïve approach we can think of is creating the buffer as you see in the slide. We assign a number to each Vulkan command and store it in the buffer. Then, depending on the command, more or less data follows. For example, let's take the sequence of commands in the slide: (1) push constants followed by (2) dispatch. We can store a token number or command id or whatever you want to call it to indicate push constants, then we follow with meta-data about the command (which is the section in green color) containing the layout, stage flags, offset and size of the push constants. Finally, depending on the size, we store the push constant values, which is the first chunk of data in blue. For the dispatch it's similar, only that it doesn't need metadata because we only want the dispatch dimensions.
But this is not how GPUs work. A GPU would have a very hard time processing this. Also, Vulkan doesn't work like this either. We want to make it possible to process things in parallel and provide as much information in advance as possible to the driver.
So in Vulkan things are different. The buffer will not contain an arbitrary sequence of commands where you don't know which one comes next. What we do is to create an Indirect Commands Layout. This is the main concept. The layout is like a template for a short sequence of commands. We create this layout using the tokens and meta-data that we saw colored red and green in the previous slide.
We specify the layout we will use in advance and, in the buffer, we only store the actual data for each command. The result is that the buffer containing commands (let's call it the DGC buffer) is divided into small chunks, called sequences in the spec, and the buffer can contain many such sequences, but all of them follow the layout we specified in advance.
In the example, we have push constant values of a known size followed by the dispatch dimensions. Push constant values, dispatch. Push constant values, dispatch. Etc.
The second thing Vulkan does is to severely limit the selection of available commands. You can't just start render passes or bind descriptor sets or do anything you can do in a regular command buffer. You can only do a few things, and they're all in this slide. There's general stuff like push constants, stuff related to graphics like draw commands and binding vertex and index buffers, and stuff to dispatch compute or ray tracing work. That's it.
Moreover, each layout must have one token that dispatches work (draw, compute, trace rays) but you can only have one and it must be the last one in the layout.
Something that's optional (not every implementation is going to support this) is being able to switch pipelines or shaders on the fly for each sequence.
Summing up, in implementations that allow you to do it, you have to create something new called Indirect Execution Sets, which are groups or arrays of pipelines that are more or less identical in state and, basically, only differ in the shaders they include.
Inside each set, each pipeline gets an index and you can change the pipeline used for each sequence by (1) specifying the Execution Set in advance (2) using an execution set token in the layout, and (3) storing a pipeline index in the DGC buffer as the token data.
The summary of how to use it would be:
First, create the commands layout and, optionally, create the indirect execution set if you'll switch pipelines and the driver supports that.
Then, get a rough idea of the maximum number of sequences that you'll run in a single batch.
With that, create the DGC buffer, query the required preprocess buffer size, which is an auxiliary buffer used by some implementations, and allocate both.
Then, you record the regular command buffer normally and specify the state you'll use for DGC. This also includes some commands that dispatch work that fills the DGC buffer somehow.
Finally, you dispatch indirect work by calling vkCmdExecuteGeneratedCommandsEXT. Note you need a barrier to synchronize previous writes to the DGC buffer with reads from it.
You can also do explicit preprocessing but I won't go into detail here.
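To make those steps a bit more tangible, here is a heavily abridged C sketch of the flow. This is my own illustration rather than code from the talk: struct contents are elided, the mem_req_info and generated_info variables stand in for fully filled structures, and all names should be double-checked against the VK_EXT_device_generated_commands spec.
static void record_dgc_sketch(VkDevice device, VkCommandBuffer cmd_buf)
{
    /* 1. Create the indirect commands layout: a push constant token followed
     *    by a dispatch token (the dispatching token must be the last one). */
    VkIndirectCommandsLayoutCreateInfoEXT layout_info = {
        .sType = VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_EXT,
        /* ... push constant token, then dispatch token ... */
    };
    VkIndirectCommandsLayoutEXT dgc_layout;
    vkCreateIndirectCommandsLayoutEXT(device, &layout_info, NULL, &dgc_layout);

    /* 2. Optionally create an indirect execution set here if pipelines or
     *    shaders will be switched per sequence and the driver supports it. */

    /* 3. Query the preprocess buffer requirements for the maximum number of
     *    sequences, then allocate the DGC buffer and the preprocess buffer. */
    VkGeneratedCommandsMemoryRequirementsInfoEXT mem_req_info = {
        .sType = VK_STRUCTURE_TYPE_GENERATED_COMMANDS_MEMORY_REQUIREMENTS_INFO_EXT,
        /* ... layout, max sequence count, etc. ... */
    };
    VkMemoryRequirements2 mem_reqs = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
    };
    vkGetGeneratedCommandsMemoryRequirementsEXT(device, &mem_req_info, &mem_reqs);

    /* 4. Record the regular command buffer: bind the state used for DGC,
     *    dispatch the work that fills the DGC buffer, then add a barrier so
     *    those writes are visible to the indirect execution below. */

    /* 5. Execute whatever the GPU wrote into the DGC buffer. */
    VkGeneratedCommandsInfoEXT generated_info = {
        .sType = VK_STRUCTURE_TYPE_GENERATED_COMMANDS_INFO_EXT,
        /* ... DGC buffer, preprocess buffer, sequence count, layout ... */
    };
    vkCmdExecuteGeneratedCommandsEXT(cmd_buf, VK_FALSE /* not preprocessed */,
                                     &generated_info);
}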
That's it. Thanks for watching, thanks to Valve for funding a big chunk of the work involved in shipping this, and thanks to everyone who contributed!
18 Nov 2024 3:55pm GMT
16 Nov 2024
planet.freedesktop.org
Tomeu Vizoso: Etnaviv NPU update 21: Support for the NPU in the NXP i.MX 8M Plus SoC is upstream!
Several months have passed since the last update. This has been in part due to the summer holidays and a gig doing some non-upstream work, but I have also had the opportunity to continue my work on the NPU driver for the VeriSilicon NPU in the NXP i.MX 8M Plus SoC, thanks to my friends at Ideas on Board.
CC BY-NC 4.0 Henrik Boye
I'm very happy with what has been accomplished so far, with the first concrete result being the merge in Mesa of the support for NXP's SoC. Thanks to Philipp Zabel and Christian Gmeiner for helping with their ideas and code reviews.
With this, as of yesterday, one can accelerate models such as SSDLite MobileDet on that SoC with only open source software, with the support being provided directly from projects that are already ubiquitous in today's products, such as the Linux kernel and Mesa3D. We can expect this functionality to reach distributions such as Debian in due time, for seamless installation and integration in products.
With this milestone reached, I will be working on expanding support for more models, with a first goal of enabling YOLO-like models, starting with YOLOX. I will be working as well on performance, as currently we are not fully using the capabilities of this hardware.
16 Nov 2024 9:27am GMT
30 Oct 2024
planet.freedesktop.org
Christian Gmeiner: CI-Tron: A Long Road to a Better Board Farm
I'm a big supporter of finding problems before they get into the code base. The earlier you catch issues, the easier they are to fix. One of the main tools that helps with this is a Continuous Integration (CI) farm. A CI farm allows you to run extensive tests like deqp or piglit on a merge request or even on a private git branch before any code is merged, which significantly helps catch problems early.
I'm not the first one at Igalia to think this is really important. We already have a large Raspberry Pi board farm available on freedesktop's GitLab instance that serves as a powerful tool for validating changes before they hit the main branch.
For a while, however, the etnaviv board farm has been offline. The main reason? I needed to clean up the setup: re-house it in a proper rack, redo all the wiring, and add more devices. What initially seemed like a few days' worth of work spiraled into months of delay, mostly because I wanted to transition to using ci-tron.
Getting Familiar with the Ci-Tron Setup
Before diving into my journey, let's quickly cover what makes up a ci-tron board farm.
- Ci-Tron Gateway: This component is the central hub that manages devices.
- PDU (Power Distribution Unit): A PDU is a device that manages the electrical power distribution to all the components in the CI farm. It allows you to remotely control the power, including power cycling devices, which is crucial for automating device management.
- DUT (Device Under Test): The heart of the CI farm. These are the devices where the actual testing happens.
The Long Road to a Working Farm
Over the past few months, I've been slowly preparing for the big ci-tron transition. The first step was ensuring my PDU was compatible. It wasn't initially supported, but after some hacking, I got it working and submitted a merge request (MR). After a few rounds of revisions, it was merged, expanding ci-tron's PDU support significantly.
The next and most critical step was getting a DUT to boot up correctly. Initially, ci-tron only supported iPXE as a boot method, but my devices are using U-Boot. I tried to make it work anyway, but the network initialization failed too often, and I found myself sinking hours into debugging.
Thankfully, rudimentary support for a U-Boot based boot flow was eventually added. After some tweaks, I managed to get my DUTs booting - but not without complications. A major problem was getting the correct Device Tree Blob (DTB) to load, which was needed for ci-tron's training rounds. A Device Tree Blob (DTB) is a binary representation of the hardware layout of a device. The DTB is used by the Linux kernel to understand the hardware configuration, including components like the CPU, memory, and peripherals. In my case, ensuring that the correct DTB was provided was crucial for the DUT to boot and be correctly managed by ci-tron. While integrating the DTB into U-Boot was suggested, it wasn't ideal. Updating the bootloader just to change a DTB is cumbersome, especially with multiple devices in the farm.
With the booting issue taking up too much time, I decided to put it on hold and focus on something else: gfxinfo.
Gfxinfo Integration Challenges
gfxinfo is a neat feature that automatically tags a DUT based on the GPU model in the system, avoiding the need to manually assign tags like gc2000. In theory, it's very convenient, but in practice there were hurdles.
gfxinfo tags Vivante GPUs using the device tree node information. However, since Vivante GPUs are quite generic, they don't have a specific model property that uniquely identifies them. The plan was to pull this information using ioctl() calls to the etnaviv kernel driver. It took a lot of back and forth in review due to the internal gfxinfo API being under-documented, but after a lot of effort, I finally got the necessary code merged. You can find all of it in this MR.
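For the curious, querying the GPU model from user space boils down to the etnaviv GET_PARAM ioctl. Below is a small illustrative sketch of that query; the header path, device node and error handling are assumptions on my part, so check them against the etnaviv uAPI (etnaviv_drm.h) before relying on it.
/* Hedged sketch: read the Vivante GPU model via the etnaviv GET_PARAM ioctl. */
#include <stdio.h>
#include <fcntl.h>
#include <xf86drm.h>
#include <etnaviv_drm.h>

int main(void)
{
    int fd = open("/dev/dri/renderD128", O_RDWR); /* device path is an example */
    struct drm_etnaviv_param req = {
        .pipe = 0,                        /* 3D pipe */
        .param = ETNAVIV_PARAM_GPU_MODEL,
    };

    if (fd < 0 || drmIoctl(fd, DRM_IOCTL_ETNAVIV_GET_PARAM, &req))
        return 1;

    /* e.g. 0x2000 for a GC2000 */
    printf("GPU model: 0x%llx\n", (unsigned long long)req.value);
    return 0;
}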
Final Push: Getting Everything to Boot
There was still one major obstacle: getting the DUT to boot reliably. Luckily, mupuf was already working on it and made a significant MR with over 80 patches to address the boot issues, introducing "boots db", a feature designed to decouple the boot process by granting each job full control over the DHCP, TFTP, and HTTP servers. This is paired with YAML configurations to flexibly define the boot specifics for each board.
As of a few days ago, the latest official ci-tron gateway image contains everything needed to get an etnaviv DUT up and running successfully.
I have to say, I'm very impressed with the end result. It took a lot longer than I had anticipated, but we finally have a plug-and-play CI farm solution for etnaviv. There are still a few missing features, like Network Block Device (NBD) support and some advanced statistics, but the ci-tron team is doing an excellent job, and I'm optimistic about what's coming next.
Conclusion: A Long Road, but Worth It
The journey to get the etnaviv board farm back online was longer than expected, full of unexpected challenges and technical hurdles. But it was worth it. The result is a robust, automated solution that makes CI testing easier and more reliable for everyone. With ci-tron, it's easier to find and fix problems before they ever make it into the code base, which is exactly what a good CI setup should be all about. There is still some work to be done on the GitLab side to switch all etnaviv jobs to the new board farm.
If you're thinking about setting up your own CI farm or migrating to ci-tron, I hope my experience helps smooth the road for you a bit. It might be a long journey, but the end results are absolutely worth it.
30 Oct 2024 12:00am GMT
28 Oct 2024
planet.freedesktop.org
Maira Canal: Unleashing Power: Enabling Super Pages on the RPi
Unleashing the power of 3D graphics in the Raspberry Pi is a key commitment for Igalia through its collaboration with Raspberry Pi. The introduction of Super Pages for the Raspberry Pi 4 and 5 marks another step in this journey, offering some performance enhancements and more efficient memory usage. In this post, we'll dive deep into the technical details of Super Pages, discuss the challenges we faced during implementation, and illustrate the benefits this feature brings to the Raspberry Pi ecosystem.
What are Super Pages?
A Memory Management Unit (MMU) is a hardware component responsible for handling memory access at the system level. It translates virtual addresses used by programs into physical addresses in main memory, enabling efficient memory management and protection. The MMU allows the operating system to allocate memory dynamically, isolating processes from one another to prevent them from interfering with each other's memory.
Recommendation: 📚 Structured computer organization by Andrew Tanenbaum
The V3D MMU, which is part of the Broadcom GPU found in the Raspberry Pi 4 and 5, is responsible for translating 32-bit virtual addresses (VA) used by V3D into 40-bit physical addresses used externally to V3D. The MMU relies on a page table, stored in physical memory, which maps virtual addresses to their corresponding physical addresses. The operating system manages this page table, and the MMU uses it to perform address translation during memory access.
A fundamental principle of modern operating systems is that memory is not stored contiguously. Instead, a contiguous block of memory is divided into smaller blocks, called "pages", which are scattered across the entire address space. These pages are typically 4KB in size. This approach enables more efficient memory management and allows for features like virtual memory and memory protection.
Over the years, the amount of available memory in computers has increased dramatically. An early IBM PC had up to 640 KiB of RAM, whereas the ThinkPad I'm typing on right now has 32 GB of RAM. Naturally, memory demands have grown alongside this increase. Today, it's common for web browsers to consume several gigabytes of RAM, and a single shader can take up multiple megabytes.
As memory usage grows, a 4KB page size may become inefficient for managing large memory blocks. Handling a large number of small pages for a single block means the MMU must perform multiple address translations, which increases overhead. This can reduce the effectiveness of the Translation Lookaside Buffer (TLB), as it must store and handle more entries, potentially leading to more cache misses and reduced overall performance.
This is why many CPU manufacturers have introduced support for larger page sizes. For instance, x86 CPUs typically support 4KB and 2MB pages, with 1GB pages available if supported by the hardware. Similarly, ARM64 CPUs can support 4KB, 16KB, and 64KB page sizes. These larger page sizes help reduce the number of pages the MMU needs to manage, improving performance by reducing the overhead of address translation and making more efficient use of the TLB.
So, if CPUs are using bigger sizes, why shouldn't GPUs do the same?
By default, V3D supports 4KB pages. However, by setting specific bits in the page table entry, it is possible to create 64KB "Big Pages" and 1MB "Super Pages." The issue is that the current V3D driver available in Linux does not enable the use of Big or Super Pages, meaning this hardware feature is currently unused.
The advantage of enabling Big and Super Pages is that once an entry for any page within a Big or Super Page is cached in the MMU, it can be used to translate all virtual addresses within that page's range without needing to fetch additional entries. In theory, this should result in improved performance, especially for applications with high memory demands, such as those using multiple large buffer objects (BOs).
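To make the translation overhead concrete, here is a small, self-contained arithmetic sketch (purely illustrative, not from the driver) showing how many page-table entries are needed to map a single 2MB BO at the three page sizes V3D supports:
#include <stdio.h>

/* Number of page-table entries required to map a buffer of a given size
 * with a given page size (sizes in bytes). */
static unsigned int entries_needed(unsigned long bo_size, unsigned long page_size)
{
    return (bo_size + page_size - 1) / page_size; /* round up */
}

int main(void)
{
    unsigned long bo = 2 * 1024 * 1024; /* a 2MB buffer object */

    printf("4KB pages:  %u entries\n", entries_needed(bo, 4 * 1024));    /* 512 */
    printf("64KB pages: %u entries\n", entries_needed(bo, 64 * 1024));   /* 32  */
    printf("1MB pages:  %u entries\n", entries_needed(bo, 1024 * 1024)); /* 2   */
    return 0;
}
Fewer entries per buffer means fewer TLB slots consumed, which is exactly where the Big/Super Page win comes from.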
As Igalia continually strives to enhance the experience for Raspberry Pi users, we decided to implement this feature in the upstream kernel. But before diving into the implementation details, let's take a look at the real-world results and see if the theoretical benefits of Super Pages have translated into measurable improvements for Raspberry Pi users.
What Does This Feature Mean for RPi Users?
With Super Pages implemented, let's now explore the actual performance improvements observed on the Raspberry Pi and see how impactful this feature is for users.
Benchmarking Super Pages: Traces and FPS Improvements
To measure the impact of Super Pages, we tested a variety of game and demo traces on the Raspberry Pi 4 and 5, covering genres from action to racing. On average, we observed a +1.40% FPS improvement on the Raspberry Pi 4 and a +1.30% improvement on the Raspberry Pi 5.
For instance, on the Raspberry Pi 4, Warzone 2100 saw an 8.36% FPS increase, and on the Raspberry Pi 5, Quake II enjoyed a 3.62% boost. These examples demonstrate the benefits of Super Pages in resource-demanding applications, where optimized memory handling becomes critical.
Raspberry Pi 4 FPS Improvements
Trace | Before Super Pages | After Super Pages | Improvement |
---|---|---|---|
warzone2100.30secs.1024x768.trace | 56.39 | 61.10 | +8.36% |
ue4_shooter_game_shooting_low_quality_640x480.gfxr | 20.71 | 21.47 | +3.65% |
quake3e_capture_frames_1800_through_2400_1920x1080.gfxr | 60.88 | 62.50 | +2.67% |
supertuxkart-menus_1024x768.trace | 112.62 | 115.61 | +2.65% |
ue4_shooter_game_shooting_high_quality_640x480.gfxr | 20.45 | 20.88 | +2.10% |
quake2-gles3-1280x720.trace | 59.76 | 60.84 | +1.82% |
ue4_sun_temple_640x480.gfxr | 27.60 | 28.03 | +1.54% |
vkQuake_capture_frames_1_through_1200_1280x720.gfxr | 54.59 | 55.30 | +1.29% |
ue4_shooter_game_low_quality_640x480.gfxr | 32.75 | 33.08 | +1.00% |
sponza_demo02_800x600.gfxr | 20.90 | 21.03 | +0.61% |
supertuxkart-racing_1024x768.trace | 8.58 | 8.63 | +0.60% |
ue4_shooter_game_high_quality_640x480.gfxr | 19.62 | 19.74 | +0.59% |
serious_sam_trace02_1280x720.gfxr | 44.00 | 44.21 | +0.50% |
ue4_vehicle_game-2_640x480.gfxr | 12.59 | 12.65 | +0.49% |
sponza_demo01_800x600.gfxr | 21.42 | 21.46 | +0.19% |
quake3e-1280x720.trace | 84.45 | 84.52 | +0.09% |
Raspberry Pi 5 FPS Improvements
Trace | Before Super Pages | After Super Pages | Improvement |
---|---|---|---|
quake2-gles3-1280x720.trace | 151.77 | 157.26 | +3.62% |
supertuxkart-menus_1024x768.trace | 306.79 | 313.88 | +2.31% |
warzone2100.30secs.1024x768.trace | 140.92 | 144.03 | +2.21% |
vkQuake_capture_frames_1_through_1200_1280x720.gfxr | 131.45 | 134.20 | +2.10% |
ue4_vehicle_game-2_640x480.gfxr | 24.42 | 24.88 | +1.89% |
ue4_shooter_game_high_quality_640x480.gfxr | 32.12 | 32.53 | +1.29% |
ue4_sun_temple_640x480.gfxr | 42.05 | 42.55 | +1.20% |
ue4_shooter_game_shooting_high_quality_640x480.gfxr | 52.77 | 53.31 | +1.04% |
quake3e-1280x720.trace | 238.31 | 240.53 | +0.93% |
warzone2100.70secs.1024x768.trace | 151.09 | 151.81 | +0.48% |
sponza_demo02_800x600.gfxr | 50.81 | 51.05 | +0.46% |
supertuxkart-racing_1024x768.trace | 20.91 | 20.98 | +0.33% |
ue4_shooter_game_low_quality_640x480.gfxr | 59.68 | 59.86 | +0.29% |
quake3e_capture_frames_1_through_1800_1920x1080.gfxr | 167.70 | 168.17 | +0.29% |
ue4_shooter_game_shooting_low_quality_640x480.gfxr | 53.40 | 53.51 | +0.22% |
quake3e_capture_frames_1800_through_2400_1920x1080.gfxr | 163.37 | 163.64 | +0.17% |
serious_sam_trace02_1280x720.gfxr | 60.00 | 60.03 | +0.06% |
sponza_demo01_800x600.gfxr | 45.04 | 45.04 | <.01% |
While an average +1% FPS improvement might seem modest, Super Pages can deliver more noticeable gains in memory-intensive 3D applications and when the GPU is under heavy usage. Let's see how the Super Pages perform on Mesa CI.
Benchmarking Super Pages: Mesa CI Job Duration
To avoid introducing regressions in user-space, I usually test my custom kernels with Mesa CI, focusing on the "broadcom-postmerge" stage to verify that all Piglit and CTS tests ran smoothly. For Super Pages, I was pleasantly surprised by the job duration results, as some job durations were reduced by several minutes.
Mesa CI Jobs Duration Improvements
Job | Before Super Pages | After Super Pages |
---|---|---|
v3d-rpi4-traces:arm64 | ~4m30s | ~3m40s |
v3d-rpi5-traces:arm64 | ~3m30s | ~2m45s |
v3d-rpi4-gl-full:arm64 */6 | ~24-25 minutes | ~22-23 minutes |
v3d-rpi5-gl-full:arm64 | ~48 minutes | ~48 minutes |
v3dv-rpi4-vk-full:arm64 */6 | ~44 minutes | ~41 minutes |
v3dv-rpi5-vk-full:arm64 | ~102 minutes | ~92 minutes |
Seeing these reductions is especially rewarding. For example, the "v3dv-rpi5-vk-full:arm64" job duration decreased by 10 minutes, meaning more FPS for users and shorter wait times for Mesa developers.
Benchmarking Super Pages: PS2 Emulation
After sharing a couple of tables, I'll admit that showcasing performance improvements solely through numbers doesn't always convey the real impact. Personally, I find it more satisfying to see performance gains in action with real-world applications.
This led me to explore PlayStation 2 (PS2) emulation on the RPi 5. From watching YouTube videos, I noticed that PS2 is a popular console for the RPi 5. While the PlayStation (PS1) emulates well even on the RPi 4, and Nintendo 64 and Sega Saturn struggle across most hardware, PS2 hits a sweet spot for testing the RPi 5's limits.
Fortunately, I still have my childhood PS2 - my second console after the Nintendo GameCube, and one of the most successful consoles worldwide, including in Brazil. With a library packed with titles like Metal Gear Solid, Resident Evil, Tomb Raider, and Shadow of the Colossus, the PS2 remains a great system for collectors and retro gamers alike.
I selected a few games from my collection to benchmark on the RPi 5 using a PS2 emulator. My emulator of choice was Aether SX2 with Vulkan support. Although AetherSX2 is no longer in development, it still performs well on the RPi.
Initially, many games were barely playable, especially those with large buffer objects, like Shadow of the Colossus and Gran Turismo 4. However, after enabling Super Pages support, I noticed immediate improvements. For example, Shadow of the Colossus wouldn't even open before Super Pages, and while it's not fully playable yet, it does load now. This isn't a silver bullet, but it's a step forward in improving the driver one piece at a time.
I ended up selecting four games for a video comparison: Burnout 3: Takedown, Metal Gear Solid 3: Snake Eater, Resident Evil 4, and Tekken 4.
Disclaimer: The BIOS used in the emulator was extracted from my own PS2, and I played only games I own, with ROMs I personally extracted. Neither I nor Igalia encourage using downloaded BIOS or ROM files from the internet.
From the video, we can see noticeable improvements in all four games. Although they aren't perfectly playable yet, the performance gains are evident, particularly in Resident Evil 4, where the gameplay saw a solid 5 FPS boost. I realize 18 FPS might not satisfy most players, but I still had a lot of fun playing Resident Evil 4 on the RPi 5.
When tracking the FPS for these games, it's clear that the performance gains go well beyond the average 1% seen in other benchmarks. Super Pages show their true potential in high-memory applications like PS2 emulation.
Having seen the performance gains Super Pages can bring to the Raspberry Pi, let's now dive into the technical aspects of the feature.
Implementing Super Pages
The first challenge was figuring out how to allocate a contiguous block of memory using shmem. The Shared Memory Virtual Filesystem (shmem) is used as a flexible memory mechanism that allows the GPU and CPU to share access to BOs through the system's temporary filesystem, tmpfs. tmpfs is a volatile filesystem that stores files in RAM, making it ideal for temporary or high-speed data that doesn't need to persist on disk.
For example, to allocate a 256KB BO across four 64KB pages, we need four contiguous 64KB memory blocks. However, by default, tmpfs only allocates memory in PAGE_SIZE chunks (as seen in shmem_file_setup()), whereas PAGE_SIZE is 4KB on the Raspberry Pi 4 and 16KB on the Raspberry Pi 5. Since the function drm_gem_object_init() - which initializes an allocated shmem-backed GEM object - relies on shmem_file_setup() to back these objects in memory, we had to consider alternatives, as the default PAGE_SIZE would divide memory into increments that are too small to ensure the large, contiguous blocks needed by the GPU.
The solution we proposed was to create drm_gem_object_init_with_mnt(), which allows us to specify the tmpfs mountpoint where the GEM object will be created. This enables us to allocate our BOs in a mountpoint that supports larger page sizes. Additionally, to ensure that our BOs are allocated in the correct mountpoint, we introduced drm_gem_shmem_create_with_mnt(), which allows the mountpoint to be specified when creating a new DRM GEM shmem object.
[PATCH v6 04/11] drm/gem: Create a drm_gem_object_init_with_mnt() function
[PATCH v6 06/11] drm/gem: Create shmem GEM object in a given mountpoint
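As a rough illustration of how a driver-side allocation path can use the new helper, here is a minimal sketch. It assumes the drm_gem_shmem_create_with_mnt(dev, size, mnt) signature from the series above and a v3d->gemfs vfsmount set up later in the series; the surrounding names are illustrative rather than the exact driver code.
/* Hedged sketch: allocate a BO from the V3D-specific tmpfs mount when
 * available, falling back to the regular shmem path otherwise. */
static struct drm_gem_shmem_object *
v3d_bo_alloc_sketch(struct drm_device *dev, size_t size)
{
    struct v3d_dev *v3d = to_v3d_dev(dev); /* assumed helper */

    if (v3d->gemfs)
        /* Backed by the V3D gemfs mount (huge=within_size), so THP can
         * hand us 64KB/1MB-contiguous chunks for Big/Super Pages. */
        return drm_gem_shmem_create_with_mnt(dev, size, v3d->gemfs);

    /* Regular shmem mount: PAGE_SIZE (4KB/16KB) allocations only. */
    return drm_gem_shmem_create(dev, size);
}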
The next challenge was figuring out how to create a new mountpoint that would allow for different page sizes based on the allocation. Simply creating a new tmpfs mountpoint with a fixed bigger page size wouldn't suffice, as we needed flexibility for various allocations. Inspired by the i915 driver, we decided to use a tmpfs mountpoint with the "huge=within_size" flag. This flag, which requires the kernel to be configured with CONFIG_TRANSPARENT_HUGEPAGE, enables the allocation of huge pages.
Transparent Huge Pages (THP) is a kernel feature that automatically manages large memory pages to improve performance without needing changes from applications. THP dynamically combines smaller pages into larger ones, typically 2MB, reducing memory management overhead and improving cache efficiency.
To support our new allocation strategy, we created a dedicated tmpfs mountpoint for V3D, called gemfs, which provides us an ideal space for managing these larger allocations.
[PATCH v6 05/11] drm/v3d: Introduce gemfs
With everything in place for contiguous allocations, the next step was configuring V3D to enable Big/Super Page support.
We began by addressing a major source of memory pressure on the Raspberry Pi: the current 128KB alignment for allocations in the virtual memory space. This alignment wastes space when handling small BO allocations, especially since the userspace driver performs a large number of these small allocations.
As a result, we can't fully utilize the 4GB address space available for the GPU on the Raspberry Pi 4 or 5. For example, we can currently allocate up to 32,000 BOs of 4KB (~140MB) and 3,000 BOs of 400KB (~1.3GB). This becomes a limitation for memory-intensive applications. By reducing the page alignment to 4KB, we can significantly increase the number of BOs, allowing up to 1,000,000 BOs of 4KB (~4GB) and 10,000 BOs of 400KB (~4GB).
Therefore, the first change I made was reducing the VA alignment of all allocations to 4KB.
[PATCH v6 07/11] drm/v3d: Reduce the alignment of the node allocation
With the alignment issue resolved, we can now implement the code to properly set the flags on the Page Table Entries (PTE) for Big/Super Pages. Setting these flags is straightforward - a simple bitwise operation. The challenge lies in determining which BOs can be allocated in Super Pages. For a BO to be eligible for a Big Page, its virtual address must be aligned to 64KB, and the same applies to its physical address. Same thing for Super Pages, but now the addresses must be aligned to 1MB.
If the BO qualifies for a Big/Super Page, we need to iterate over 16 4KB pages (for Big Pages) or 256 4KB pages (for Super Pages) and insert the appropriate PTE.
Additionally, we modified the way we iterate through the BO's memory. This was necessary because the THP may not always allocate the entire BO contiguously. For example, it might only allocate contiguously 1MB of a 2MB block. To handle this, we now iterate over the blocks of contiguous memory scattered across the scatterlist, ensuring that each segment is properly handled during the allocation process.
What is a scatterlist? It is a Linux Kernel data structure that manages non-contiguous memory as if it were contiguous. It organizes separate memory blocks into a single logical buffer, allowing efficient data handling, especially in Direct Memory Access (DMA) operations, without needing a physically contiguous memory allocation.
[PATCH v6 08/11] drm/v3d: Support Big/Super Pages when writing out PTEs
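The gist of that patch, as I understand it, is sketched below: walk the contiguous DMA segments of the BO's scatterlist and, whenever both the virtual and the physical address of a chunk are suitably aligned and the chunk is large enough, write out the PTEs with the Big or Super Page bit set. The V3D_PTE_* flag names and the insert_ptes() helper are placeholders, not the exact macros from the driver.
/* Hedged sketch of the Big/Super Page PTE write-out. */
static void v3d_map_bo_sketch(struct v3d_dev *v3d, struct sg_table *sgt, u32 va)
{
    struct scatterlist *sg;
    unsigned int i;

    for_each_sgtable_dma_sg(sgt, sg, i) {
        dma_addr_t pa = sg_dma_address(sg);
        unsigned int len = sg_dma_len(sg);

        while (len) {
            unsigned int chunk = SZ_4K;
            u32 flags = 0;

            if (len >= SZ_1M && IS_ALIGNED(va, SZ_1M) && IS_ALIGNED(pa, SZ_1M)) {
                chunk = SZ_1M;             /* 256 x 4KB PTEs, all flagged */
                flags = V3D_PTE_SUPERPAGE; /* placeholder name */
            } else if (len >= SZ_64K && IS_ALIGNED(va, SZ_64K) &&
                       IS_ALIGNED(pa, SZ_64K)) {
                chunk = SZ_64K;            /* 16 x 4KB PTEs, all flagged */
                flags = V3D_PTE_BIGPAGE;   /* placeholder name */
            }

            insert_ptes(v3d, va, pa, chunk, flags); /* placeholder helper */
            va += chunk;
            pa += chunk;
            len -= chunk;
        }
    }
}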
However, the last few patches alone don't fully enable the use of Super Pages. While PATCH 08/11 technically allows for Super Pages, we're still relying on DRM GEM shmem objects, meaning allocations are still happening in PAGE_SIZE chunks. Although Big/Super Pages could potentially be used if the system naturally allocated 1MB or 64KB contiguously, this is quite rare and not our intended outcome. Our goal is to actively use Big/Super Pages as much as possible.
To achieve this, we'll utilize the V3D-specific mountpoint we created earlier for BO allocation whenever possible. By creating BOs through drm_gem_shmem_create_with_mnt(), we can ensure that large pages are allocated contiguously when possible, enabling the consistent use of Big/Super Pages.
[PATCH v6 09/11] drm/v3d: Use gemfs/THP in BO creation if available
And there you have it: Big/Super Pages are now fully enabled in V3D. The only requirement to activate this feature in any given kernel is ensuring that CONFIG_TRANSPARENT_HUGEPAGE is enabled.
Final Words
You can learn more about ongoing enhancements to the Raspberry Pi driver stack in this XDC 2024 talk by José María "Chema" Casanova Crespo. In the talk, Chema discusses the Super Pages work I developed, along with other advancements in the driver stack.
Of course, there are still plenty of improvements on the horizon at Igalia. I'm currently experimenting with 64KB CLE allocations in user-space, and I hope to share more good news soon.
Finally, I'd like to express my gratitude to Iago Toral and Tvrtko Ursulin for their invaluable support in developing Super Pages for the V3D kernel driver. Thank you both for sharing your experience with me!
28 Oct 2024 12:00pm GMT
23 Oct 2024
planet.freedesktop.org
Bastien Nocera: wireless_status kernel sysfs API
(I worked on this feature last year, before being moved off desktop related projects, but I never saw it documented anywhere other than in the original commit messages, so here's the opportunity to shine a little light on a feature that could probably see more use)
The new usb_set_wireless_status() driver API function can be used by drivers of USB devices to export whether the wireless device associated with that USB dongle is turned on or not.
To quote the commit message:
This will be used by user-space OS components to determine whether the battery-powered part of the device is wirelessly connected or not, allowing, for example:
- upower to hide the battery for devices where the device is turned off but the receiver plugged in, rather than showing 0%, or other values that could be confusing to users
- Pipewire to hide a headset from the list of possible inputs or outputs or route audio appropriately if the headset is suddenly turned off, or turned on
- libinput to determine whether a keyboard or mouse is present when its receiver is plugged in.
This is not an attribute that is meant to replace protocol specific APIs [...] but solely for wireless devices with an ad-hoc "lose it and your device is e-waste" receiver dongle.
Currently, the only 2 drivers to use this are the ones for the Logitech G935 headset, and the Steelseries Arctis 1 headset. Adding support for other Logitech headsets would be possible if they export battery information (the protocols are usually well documented), support for more Steelseries headsets should be feasible if the protocol has already been reverse-engineered.
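For driver authors the call itself is tiny. Here is a minimal sketch of how a headset driver might report the state, assuming it already parses a vendor report that tells it whether the wireless link is up (that parsing is the hard, device-specific part, and everything around the usb_set_wireless_status() call is illustrative):
/* Hedged sketch: report the wireless link state for the USB interface
 * backing this device. Only usb_set_wireless_status() is the real API;
 * the surrounding function is illustrative. */
#include <linux/usb.h>

static void headset_update_wireless_status(struct usb_interface *intf,
                                           bool connected)
{
    usb_set_wireless_status(intf,
                            connected ? USB_WIRELESS_STATUS_CONNECTED
                                      : USB_WIRELESS_STATUS_DISCONNECTED);
}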
As far as consumers for this sysfs attribute, I filed a bug against Pipewire (link) to use it to not consider the receiver dongle as good as unplugged if the headset is turned off, which would avoid audio being sent to headsets that won't hear it.
UPower supports this feature since version 1.90.1 (although it had a bug that makes 1.90.2 the first viable release to include it), and batteries will appear and disappear when the device is turned on/off.
A turned-on headset
23 Oct 2024 12:06pm GMT
20 Oct 2024
planet.freedesktop.org
Simon Ser: Status update, October 2024
Hi!
This month XDC 2024 took place in Montreal. I wasn't there in-person, but thanks to the organizers I could still ask questions and attend workshops remotely (thanks!). As usual, XDC has been a great reminder of many things I wanted to do but which got buried under a pile of emails. We've discussed the upcoming KMS color management uAPI again, I've taken a bit of time to send more comments and it looks like this one is getting close to completion (famous last words). We've also discussed display muxing (switching a connector from one GPU to another one), it's quite fun how surprisingly tricky this process is. Another topic was better multi-GPU support, in particular how to avoid going through the main GPU when an application is rendered and displayed on a secondary GPU. I've sent a proposal to improve the kernel DMA-BUF uAPI.
New this year was the Wayland workshop organized by Mike Blumenkrantz, Daniel Stone and Jonas Ådahl. We've discussed the governance change proposals sent earlier this month. Various changes are being discussed, all have the goal to lower the barrier to entry when contributing a protocol and preventing patches from getting stuck. I'm excited to see how this turns out!
We've finally started the release candidate cycle for Sway 1.10. I've released Sway 1.10-rc4 this weekend with a bunch more fixes, I'm hoping the final release can go out soon! I've also released the long overdue cage 0.2.0, which fast forwards wlroots to version 0.18 and adds primary selection support.
I've sent a patch to add a udmabuf allocator to wlroots. This is useful for running the wlroots GLES2 and Vulkan renderers with software rendering (e.g. llvmpipe and lavapipe), which is handy for CI and exercises the same codepaths as real hardware instead of the seldom used Pixman renderer.
wlroots-rs has been updated to wlroots v0.18, and I've revamped the way the compositor state is managed. Previously, the library forced the use of Rc<RefCell<T>> to hold the state, which caused issues with double mutable borrows at runtime when compositor callbacks were nested (wlroots invokes a compositor callback, which borrows the state and calls into wlroots, which invokes another compositor callback that borrows the state again). With the new design, the compositor passes its state as an argument to all wlroots functions that may emit signals and call back into the compositor.
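To make the borrow problem concrete, here is a small, self-contained Rust sketch of both patterns. It deliberately doesn't use the real wlroots-rs API; the types and function names are invented purely for illustration, under the assumption that the library threads the state through its callbacks in the new design.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct State {
    outputs: u32,
}

// Pattern 1: shared state behind Rc<RefCell<..>>. If the library calls back
// into the compositor while the compositor already holds a mutable borrow,
// the nested borrow_mut() panics at runtime ("already borrowed").
fn old_style(state: Rc<RefCell<State>>) {
    let mut s = state.borrow_mut(); // first mutable borrow
    s.outputs += 1;
    // Imagine this re-enters the library, which then invokes another
    // compositor callback that also calls state.borrow_mut() -> panic:
    // library_call_that_may_emit_signals(...);
}

// Pattern 2: the compositor passes its state explicitly to every library
// function that may emit signals; the library hands the same &mut State back
// to nested callbacks, so there is only ever one mutable path to the state.
fn library_call_that_may_emit_signals(state: &mut State, handler: fn(&mut State)) {
    // ... library work happens here, then it calls back into the compositor,
    // threading the state through instead of letting it be re-borrowed.
    handler(state);
}

fn new_style(state: &mut State) {
    state.outputs += 1;
    library_call_that_may_emit_signals(state, |s| {
        s.outputs += 1; // nested callback mutates the same state safely
    });
}

fn main() {
    let shared = Rc::new(RefCell::new(State { outputs: 0 }));
    old_style(shared.clone());

    let mut state = State { outputs: 0 };
    new_style(&mut state);
    println!("outputs: {}", state.outputs);
}
```

The trade-off is more verbose function signatures, but the aliasing rules are enforced at compile time instead of surfacing as RefCell panics at runtime.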
delthas has contributed a whole bunch of soju patches used by his new hosted bouncer service, IRC Today. Uploaded videos and PDF files can now be viewed inline in Web browsers, a new HTTP basic authentication backend has been added, file uploads can now be delegated to a separate HTTP backend, a new soju.im/SAFERATE specification indicates when clients don't need to rate-limit their messages, and there are various smaller improvements and fixes. More exciting features are in the pipeline as well (but I won't spoil them just yet)!
Matthew Hague has contributed TLS certificate pinning to Goguma. When hitting an invalid certificate, Goguma will now offer the user a choice to trust this specific certificate (trust on first use). gamja now supports drag-and-drop for file uploads thanks to xse. Both gamja and Goguma have moved to Codeberg; I hope this lowers the barrier to entry for contributing. A tiny NPotM is soju-containers, a repository containing Dockerfiles for soju and gamja, for easy deployment and testing.
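For readers unfamiliar with the idea, trust on first use boils down to remembering a certificate's fingerprint the first time the user accepts it and comparing against that fingerprint on later connections. The following Rust sketch shows only that bookkeeping; Goguma itself is written in Dart and hooks into the TLS handshake, so every name below is an illustrative assumption rather than its actual implementation.

```rust
use std::collections::HashMap;

/// Pinned certificate fingerprints, keyed by host. A real client would
/// persist this map to disk; here it lives in memory only.
#[derive(Default)]
struct PinStore {
    pins: HashMap<String, String>, // host -> certificate fingerprint (e.g. SHA-256 hex)
}

enum Decision {
    Trusted, // fingerprint matches the pinned one
    AskUser, // first connection, or certificate changed: prompt the user
}

impl PinStore {
    /// Decide what to do with the certificate presented by `host`.
    fn check(&self, host: &str, fingerprint: &str) -> Decision {
        match self.pins.get(host) {
            Some(pinned) if pinned == fingerprint => Decision::Trusted,
            _ => Decision::AskUser,
        }
    }

    /// Called when the user explicitly accepts the certificate.
    fn pin(&mut self, host: &str, fingerprint: &str) {
        self.pins.insert(host.to_string(), fingerprint.to_string());
    }
}

fn main() {
    let mut store = PinStore::default();

    // First connection to a server with a self-signed certificate:
    // nothing is pinned yet, so the client prompts the user.
    if let Decision::AskUser = store.check("irc.example.org", "ab:cd:ef") {
        // The user taps "trust"; remember the fingerprint for next time.
        store.pin("irc.example.org", "ab:cd:ef");
    }

    // Later connections with the same certificate are accepted silently.
    assert!(matches!(store.check("irc.example.org", "ab:cd:ef"), Decision::Trusted));
}
```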
Both hottub and yojo now have support for build secrets. For hottub, secrets are only enabled when the owner pushes commits (and enables the feature at setup time). For yojo, the owner needs to enable the feature at setup time and can then select specific secrets to expose on specific repositories. All of this is locked down to prevent collaborators from gaining access to arbitrary secrets when pushing to a repository.
That's all for now, see you next month!
20 Oct 2024 10:00pm GMT
15 Oct 2024
planet.freedesktop.org
Mike Blumenkrantz: Recovery
Struggling
Last week was XDC. I did too much Wayland, and now I've been stricken with a plague for my hubris.
I have some updates, but I lack the ability to fully capture the exploits of Mesa's most sane developer in the ten minutes I'm awake every day. In the meanwhile, let's take a look at another potential example of great hubris.
Hm.
Have you ever made a decision that seemed great at the time but then you realized later it was actually maybe not that great? Like, maybe it was actually really, uh, well, not dumb since nobody reading this blog would do something like that, but not…smart. And everyone else was kinda going along with your decision and trusting that you knew what you were talking about because let's face it, you're smart. Everyone knows how smart you are. That's why they trust you to make these decisions.
Long-time SGC readers know I'm not one to make decisions of any kind, but we all remember that time Microsoft famously introduced Work Graphs to D3D and also (quietly) deprecated ExecuteIndirect. The argument was compelling: why not just move all the work to the GPU?
Haters described Work Graphs as just another attempt by the driver cartel to blame bugs on app developers by making tooling impossible. The rest of us were all in. We jumped on that bandwagon like it was the last triangle in the pipe before a crash. It wasn't long before the high-powered players were aboard:
Details were light at this stage. There were no benchmarks, no performance numbers, no games or applications using Work Graphs, but everyone trusted Microsoft. Everyone knew the idea of this tech was sound, that it had to be faster.
Microsoft doubled down: Work Graphs would support mesh nodes for drawing!
Other graphics wizards began to get involved. The developerverse was in a tizzy. Everyone wanted in on the action.
The hype train had departed the station.
Hm?
Six months after GDC, the first notable performance figures for Work Graphs were blogged about by AAA graphics rockstar, Kostas Anagnostou. I was at a Khronos F2F when it happened, and the number of laptop screens open to the post when it dropped was nonzero. Very nonzero.
At best, the figures were whelming.
Still, there was no real analysis of Work Graph performance in comparison to alternative solutions. Haters will say I'm biased after recently shipping Vulkan's device generated commands extension, but this was going to ship regardless, since vkd3d-proton requires cross-vendor compatibility for the ExecuteIndirect functionality used in games like Halo Infinite and Starfield. I'm all about the numbers. Show me the graphs. The perf graphs, that is.
Fortunately, friend of the blog and veteran vertex wrangler, Hans-Kristian Arntzen, always has my back. He's spent the past few months heroically writing vkd3d-proton emulation for Work Graphs, and he has recently posted his findings to an obscure README in that repository.
READ IT. SERIOUSLY. YES, THIS IS A FULL PAGE-WIDTH LINK SO YOU CAN'T POSSIBLY MISS IT.
If you're just here for the quick summary (which you shouldn't be, considering how much time he has spent making charts and graphs, taking screenshots, and summing everything up in bite-sized morsels for easy consumption):
- Across the board, Work Graph performance is not very exciting
- Emulation with core Vulkan compute shader features is up to 3x faster
- Comparison test cases against ExecuteIndirect (which show EI being worse) do not effectively leverage that functionality, as noted by Hans-Kristian nearly six months ago
The principle of charity requires taking serious claims in the best possible light. This should have yielded robust, powerful ExecuteIndirect benchmark usage (and even base compute/mesh shader usage) to provide competitive benchmarks against Work Graph functionality. At the time of writing, those benchmarks have yet to materialize, and the only test cases are closer to strawmen that can be held up for an easy victory.
I'm not saying that Work Graphs are inherently bad.
Yet.
At this point, however, I haven't seen compelling evidence which validates the hype surrounding the tech. I haven't seen great benchmarks and demos. Maybe it's a combination of that and still-improving driver support. Maybe it's as-yet-unavailable functionality awaiting future hardware. In any case, I haven't seen a strong, fact-based technical argument which proves, beyond a doubt, that this is the future of graphics.
Before anyone else tries to jump on the Work Graph hype train, I think we owe it to ourselves to thoroughly interrogate this new paradigm and make sure it provides the value that everyone expects.
15 Oct 2024 12:00am GMT