25 Jun 2017

planet.freedesktop.org

Rob Clark: long overdue update

Since it has been a while since the last update, I guess it is a good time to post an update on some of the progress that has been happening with freedreno and upstream support for snapdragon boards.

freedreno / mesa

While the 17.1 release included enabling reorder support by default, there have been many other interesting features landed since the 17.1 branch point (so they will be included in the future 17.2 release). Many, but not all, are related to a5xx. (Something that I just realized I forgot to blog about, but have demoed here and there.)

GL/GLES Compute Shaders:

So far this is only a5xx (although a4xx seems to work similarly, and would probably be not too hard to get working if someone had the right hardware and a bit of time). SSBOs and atomics are supported, but image support (an important part of compute shaders) is still TODO (and some r/e required, although it seems to share a lot in common with SSBOs). Adreno 3xx support for compute shaders appears to be more work (ie. less in common with a4xx/a5xx, and probably part of the reason that qualcomm never bothered adding support in android blob driver). Patches welcome, but for now a3xx compute support is far enough down my TODO list that it might not otherwise happen.

I know there is a lot of interest in open source OpenCL support for freedreno, and hopefully that is something that will come in the future. But there is the big challenge of how to get opencl shaders (kernels) into a form that can be consumed by freedreno's ir3 shader compiler backend. While there is some potential to re-use spirv_to_nir at some point, there are some complicated details. For compute kernels (ie. OpenCL) there are some restrictions lifted on SPIRV that spirv_to_nir relies on. (Little details like lack of requirement for structured flow control.)

A5xx HW Binning Support:

Traditionally hw binning support, while a pretty big perf boost, has been kinda difficult (translation: lot of things can be done wrong to lead to difficult to debug GPU lockups), but this time around it wasn't so hard. I guess experience on a3xx/a4xx has helped. And everyone loves a ~30% fps boost in their favorite game!

This has brought performance roughly up to the same level as ifc6540/a420. Which sounds bad, but remember we are comparing apples and oranges. On ifc6540 (snapdragon 805), we don't yet have upstream kernel support so this was using a 3.10 android kernel (with bus-scaling and all the downstream tricks to optimize memory bandwidth and overall SoC performance). But on a530 (dragonboard820c), I never had a working downstream kernel (or had to bother backporting the upstream drm/msm driver to some ancient android kernel.. hurray!). The upshot is that any perf #'s for a5xx don't include bus-scaling, cpufreq, etc. I expect a pretty big performance boost on a530 once we have a way to clock up memory/interconnects. (Ie. on micro-benchmarks a530 is >2x faster than a420 on alu limited workloads, but still a bit slower than a420 on bandwidth limited workloads, despite having a higher theoretical bandwidth.)

Side note, linaro is working on an upstream solution for bus-scaling. This is a very important improvement needed upstream for ARM SoC's, especially ones that optimize so strongly for battery life. (Keep in mind that interconnects, which span across the SoC, and memory, are a big power consumer in a modern SoC.. so a lot of qualcomm's good performance + battery life in phones comes down to these systemwide optimizations.) It is equivalent to slow memory clockings on some generations of nouveau, except in this case it is outside the gpu driver (ie. we aren't talking about vram on a discrete gpu), and the reason is to enable a high end phone SoC to last a couple days on battery, rather than keeping your video card from melting.

A5xx gles3.0/gl3.1 support:

Probably it would have made sense to spend time on this before compute shaders (since they are otherwise only exposed with $MESA_GL_VERSION_OVERRIDE tricks.. but hey, I was curious about how compute shaders worked). After an assortment of small things to r/e and implement, we were just a few (~50) texture/vbo/fb formats away from gl3.1. Nothing really exciting. Mostly just a few weekends probing unknown format #'s and seeing which piglit format tests started passing. The sort of thing that would have taken approximately 10 minutes with docs.. but hey, it needed to be done.

Switching to NIR by default:

This is one thing that benefits a3xx and a4xx as well as a5xx. While freedreno has had NIR support for a while, it hasn't been enabled by default until more recently. The issue was handling of complex dereferences (multi-dimensional arrays, arrays of structs, etc). The problem was that freedreno's ir3 backend preferred to keep things in SSA form (since that gives the instruction scheduler more flexibility, which is pretty important in the a3xx+ instruction set architecture (ir3)). Adding support to lower arrays to regs allowed moving the deref offset calculation to NIR so that we wouldn't regress by turning NIR on by default. This is useful since it cuts shader compilation time, but also because tgsi_to_nir doesn't support SSBOs, atomics, and other new shiny glsl features. (Now we only rely on tgsi_to_nir for various legacy paths and built-in blit shaders which don't need new shiny glsl features.)

A5xx HW Query Support:

Adreno 5xx changed how hw queries (ie. occlusion query and time-elapsed query, etc) work. For the better, since now we can accumulate per-tile results on the GPU. But it required some new support in freedreno for a different sort of query, and some r/e about how this actually worked. And while we had previously lied about occlusion query support (mostly to expose more than gl1.4 support), that isn't a very good long term solution. In addition, time-elapsed query is useful for performance/profiling work, so helpful for some of the following projects.

A5xx LRZ Support:

Adreno 5xx adds another cute optimization called "LRZ". (Presumably "low resolution Z (depth buffer)".) I've spent some time r/e'ing this feature and implementing support for it in freedreno. It is a neat new hw trick that a5xx has, which serves two purposes.

The basic idea is to have a per-quad depth value so that in the binning pass primitives can be rejected (per tile) based on depth (ie. reject more early). But then recycle the LRZ buffer in draw phase to function as for-free depth pre-pass (ie. reject earlier primitives based on the z value of later primitives).
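
To make the idea a bit more concrete, here is a rough sketch of a generic conservative low-resolution depth test. This is not how the a5xx hardware (or freedreno) actually implements LRZ; the `LowResZ` type, its methods, the block granularity and the "smaller z is closer" convention are all assumptions made up for illustration. A first pass over the geometry records, per block, how far the visible surface can possibly be; the second pass then rejects anything that is certainly behind it, including primitives that were submitted earlier than the one that hides them.

```go
package main

import "fmt"

// LowResZ is a hypothetical low-resolution depth buffer: one conservative
// depth value per block of pixels. Smaller z means closer to the camera,
// and the depth test is assumed to be "less".
type LowResZ struct {
	w, h int       // size in blocks, not pixels
	far  []float32 // farthest depth that can still be visible in each block
}

// NewLowResZ creates a buffer cleared to the far plane (1.0).
func NewLowResZ(w, h int) *LowResZ {
	far := make([]float32, w*h)
	for i := range far {
		far[i] = 1.0
	}
	return &LowResZ{w: w, h: h, far: far}
}

// Cover records an opaque primitive that fully covers block (bx, by) and
// whose farthest depth inside that block is maxZ.
func (l *LowResZ) Cover(bx, by int, maxZ float32) {
	i := by*l.w + bx
	if maxZ < l.far[i] {
		l.far[i] = maxZ
	}
}

// Rejected reports whether a primitive whose nearest depth inside block
// (bx, by) is minZ is certainly hidden there and can be skipped.
func (l *LowResZ) Rejected(bx, by int, minZ float32) bool {
	return minZ >= l.far[by*l.w+bx]
}

func main() {
	lrz := NewLowResZ(4, 4)

	// First pass (think: binning): a near wall that is drawn *late* in the
	// frame fills block (1,1).
	lrz.Cover(1, 1, 0.3)

	// Second pass (think: draw): geometry submitted earlier in the frame
	// is tested against the values gathered in the first pass.
	fmt.Println(lrz.Rejected(1, 1, 0.7)) // behind the wall
	fmt.Println(lrz.Rejected(1, 1, 0.1)) // in front of the wall
	fmt.Println(lrz.Rejected(2, 2, 0.7)) // block the wall does not cover
}
```

The sketch only handles fully covered blocks and opaque geometry, which is what keeps the test conservative: it may fail to reject something hidden, but it never rejects something visible.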

The benefit depends on how well optimized the game is. Ie. games that are well optimized for traditional GPU architectures (ie. sorting geometry, already doing depth pre-passes, etc) won't benefit as much.. but this helps a lot for badly written games that relied on per-pixel deferred rendering.

Overall, for things like stk/xonotic, it seems like a ~5-10% win.
The main remaining performance trick for a5xx is UBWC (ie. bandwidth compression) + tiled textures. I've worked out mostly how UBWC works (in particular texture layout, at least for 2d textures + mipmap, but I think we can infer how 2d arrays, 3d, etc, work from that). Most of the infrastructure for upload/download blits (to convert to/from linear) should be easier thanks to the reorder support. We'll see if I actually find time to implement it before the mesa 17.2 branch point.

Standardized Embedded Nonsense Hacks

Anyone who has dealt with arm (non-server) devices, should be familiar with the silly-embedded-nonsense-hacks world. In particular the non-standard boot-chain which makes it difficult for distro's to support the plethora of arm boards (let alone phones/tablets/etc) out there without per-board support. Which was fine in the early days, but N boards times M distro's, it really doesn't scale.

Thanks to work by Mateusz Kulikowski, we now have u-boot support for dragonboard 410c. It's been on my TODO list to play with for a while. But more recently I realized that u-boot, thanks to the work of many others, can provide enough of EFI runtime-services interface for grub to work. This means that it is a path forward for standardized distros on aarch64 (like fedora and opensuse), which expect UEFI, to boot on boards which don't otherwise have UEFI firmware.

So I decided to spend a bit of time pretending to be a crack smoking firmware engineer. (Not literally, of course.. that would be stupid!)

After fixing some linker script bugs with u-boot's db410c support vs efi_runtime section, and debugging some issues with grub finding the boot disk with the help of Peter Jones (the resident grub/EFI expert who conveniently sits near me), and a couple other misc u-boot fixes, I had a fedora 26 alpha image booting on the db410c.

The next step was figuring out display, so we could have grub boot menu on screen, like you would expect on a grown-up platform. As it turns out, on most devices, lk (little kernel, ie. what normally loads the kernel+initrd on snapdragon android devices) already supports lighting up the display, since most/all android devices put up the initial splash-screen before the kernel is loaded. Unfortunately this was not the case with the db410c's lk. But Archit (qcom engineer who has contributed a whole lot of drm/msm and other drm patches) pointed me at a different lk branch (among the 100's) which had msm8916 display + adv7533 dsi->hdmi bridge (like what db410c uses). After digging through a convoluted git history, I was able to track down the relevant gpio/i2c/adv7533 patches to port to the lk branch used on db410c.

After that, I added support for lk to populate a framebuffer node, using the simple-framebuffer bindings to pass the pre-configured scanout buffer (+dimensions) to u-boot. This plus a new simplefb video driver for u-boot, enables u-boot to expose display support to grub via the EFI GOP protocol. (Along the way I had to add 32bpp rgb support to lk since u-boot and grub don't understand packed 24bpp rgb.)

All this got to the point of:


This is a fedora image, booting off of usb disk (ie. not just rootfs on usb disk, but also grub/kernel/initrd/dtb). With graphical grub menu to select which kernel to boot, just like you would expect on a PC. The grubaa64.efi here is vanilla distro boot-loader, and from the point of view of the distro image, lk/u-boot is just the platform's firmware which somehow provides the UEFI interface the distro media expects. It is worth pointing out some advantages over a traditional lk->kernel boot chain:
  • booting from USB, network, etc (which lk cannot do)
  • doesn't require kernel packed in custom boot.img partition which is board specific
  • booting installer image (ie. from sd-card or network)
When the kernel starts, in early boot, it is using efifb, just like it would on a PC. (Ie. so you can see what is going on on-screen before hw specific drm driver kernel module is loaded).

There are still a few rough edges. The drm/msm driver and msm clk drivers are a bit surprised when some clks are already enabled when the kernel starts, and the display is already lit up.. now we have a good reason to fix some of those issues. And right now we don't have a good way to load a newer device tree binary (dtb) after a distro kernel update (ie. without updating u-boot, aka "the firmware"). (For simple SoC's maybe a pre-baked dtb for the life of the board is sufficient... I have my doubts about that for SoCs as complex as the various snapdragon's, if for no other reason than that we haven't even figured out how to model all the features of the existing SoCs in devicetree.) One idea is for u-boot to pass to grub the name of the board dtb file to load via EFI variables. I've sent a very early RFC to add EFI variable support in u-boot. We'll see how this goes, in the mean time there might be more "firmware" upgrades needed than you'd normally expect on a mature platform like x86.

For now, my lk + u-boot work is here:
and prebuilt "firmware" is here. For now you will need to edit distro grub.cfg to add 'devicetree' commands to load appropriate dtb since what is included with u-boot.img is a very minimal fdt (ie. just enough for the drivers in u-boot).




25 Jun 2017 10:03pm GMT

Nicolai Hähnle: ARB_gl_spirv, NIR linking, and a NIR backend for radeonsi

SPIR-V is the binary shader code representation used by Vulkan, and GL_ARB_gl_spirv is a recent extension that allows it to be used for OpenGL as well. Over the last weeks, I've been exploring how to add support for it in radeonsi.

As a bit of background, here's an overview of the various relevant shader representations that Mesa knows about. There are some others for really old legacy OpenGL features, but we don't care about those. On the left, you see the SPIR-V to LLVM IR path used by radv for Vulkan. On the right is the path from GLSL to LLVM IR, plus a mention of the conversion from GLSL IR to NIR that some other drivers are using (i965, freedreno, and vc4).

For GL_ARB_gl_spirv, we ultimately need to translate SPIR-V to LLVM IR. A path for this exists, but it's in the context of radv, not radeonsi. Still, the idea is to reuse this path.

Most of the differences between radv and radeonsi are in the ABI used by the shaders: the conventions by which the shaders on the GPU know where to load constants and image descriptors from, for example. The existing NIR-to-LLVM code needs to be adjusted to be compatible with radeonsi's ABI. I have mostly completed this work for simple VS-PS shader pipelines, which has the interesting side effect of allowing the GLSL-to-NIR conversion in radeonsi as well. We don't plan to use it soon, but it's nice to be able to compare.

Then there's adding SPIR-V support to the driver-independent mesa/main code. This is non-trivial, because while GL_ARB_gl_spirv has been designed to remove a lot of the cruft of the old GLSL paths, we still need more supporting code than a Vulkan driver. This still needs to be explored a bit; the main issue is that GL_ARB_gl_spirv allows using default-block uniforms, so the whole machinery around glUniform*() calls has to work, which requires setting up all the same internal data structures that are set up for GLSL programs. Oh, and it looks like assigning locations is required, too.

My current plan is to achieve all this by re-using the GLSL linker, giving a final picture that looks like this:

So the canonical path in radeonsi for GLSL remains GLSL -> AST -> IR -> TGSI -> LLVM (with an optional deviation along the IR -> NIR -> LLVM path for testing), while the path for GL_ARB_gl_spirv is SPIR-V -> NIR -> LLVM, with NIR-based linking in between. In radv, the path remains as it is today.

Now, you may rightfully say that the GLSL linker is a huge chunk of subtle code, and quite thoroughly invested in GLSL IR. How could it possibly be used with NIR?

The answer is that huge parts of the linker don't really care that much about the code in the shaders that are being linked. They only really care about the variables: uniforms and shader inputs and outputs. True, there are a bunch of linking steps that touch code, but most of them aren't actually needed for SPIR-V. Most notably, GL_ARB_gl_spirv doesn't require intrastage linking, and it explicitly disallows the use of features that only exist in compatibility profiles.

So most of the linker functionality can be preserved simply by converting the relevant variables (shader inputs/outputs, uniforms) from NIR to IR, then performing the linking on those, and finally extracting the linker results and writing them back into NIR. This isn't too much work. Luckily, NIR reuses the GLSL IR type system.

There are still parts that might need to look at the actual shader code, but my hope is that they are few enough that they don't matter.

And by the way, some people might want to move the IR -> NIR translation to before linking, so this work would set a foundation for that as well.

Anyway, I got a ridiculously simple toy VS-PS pipeline working correctly this weekend. The real challenge now is to find actual test cases...

25 Jun 2017 9:10pm GMT

Planet KDE

Go support in KDevelop. GSoC week 4. DU-Chain time!

During this week, I decided to spend more time on language support: code completion, highlighting and so on. This part is provided by the DU-Chain. DU-Chain stands for Definition-Use chain, which consists of various contexts, declarations in these contexts and usages of these declarations.

The first change improved the declaration of variables in parameters of anonymous functions. In Go, it is possible to define an anonymous function and assign it to a variable, pass it as a parameter (people use this for various callbacks, for example) or simply call it. Before my change, parameters of anonymous functions were treated as declarations only when the function was assigned to a variable. Thus, if, for example, you typed the example of Gin web framework usage:

package main

import "gopkg.in/gin-gonic/gin.v1"

func main() {
	r := gin.Default()
	r.GET("/ping", func(c *gin.Context) {
		c.JSON(200, gin.H{
			"message": "pong",
		})
	})
	r.Run() // listen and serve on 0.0.0.0:8080
}

you would end up with "c" not being highlighted / treated as a variable. After my change, parameters of anonymous functions are treated as variable declarations in all three cases: assigning, passing and calling (see screenshots).
Before

After
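
The three cases mentioned above (assigning, passing and calling) can be sketched with only the standard library. The names `apply`, `double`, `squared` and `sum` are made up for this illustration; the point is that the parameters `a`, `b`, `c` and `d` are all declarations that the language support needs to pick up.

```go
package main

import "fmt"

// apply calls the callback it receives; it stands in for an API such as a
// router that takes a handler function.
func apply(n int, f func(x int) int) int {
	return f(n)
}

func main() {
	// 1. Assigning: the function literal is stored in a variable.
	double := func(a int) int { return a * 2 }

	// 2. Passing: the function literal is an argument of another call.
	squared := apply(3, func(b int) int { return b * b })

	// 3. Calling: the function literal is invoked immediately.
	sum := func(c, d int) int { return c + d }(1, 2)

	fmt.Println(double(4), squared, sum)
}
```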



The second change is under review and adds code completion from embedded structs. Go has no inheritance; composition is preferred instead. Composition often has a drawback: all calls to "base" methods have to be forwarded, which means a lot of boilerplate code. Go solves this problem by "embedding" structs, so that their fields and methods are added to the top-level struct. For example, struct A has a method Work and struct B embeds struct A, so both B.Work() and B.A.Work() are correct. Because of that, we need to walk the whole embedding tree to retrieve all possible completions, which is what my second change does.
Before
After
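
The A/B example from the paragraph above looks roughly like this in code. The field names `Name` and `Level` are only there to make the sketch complete; they are not taken from the plugin or its tests.

```go
package main

import "fmt"

// A is the "base" struct with a field and a method.
type A struct {
	Name string
}

// Work is declared on A only.
func (a A) Work() string {
	return a.Name + " works"
}

// B embeds A, so A's fields and methods are promoted to B.
type B struct {
	A
	Level int
}

func main() {
	b := B{A: A{Name: "worker"}, Level: 1}

	fmt.Println(b.Work())   // promoted method, no forwarding code needed
	fmt.Println(b.A.Work()) // the explicit path is still valid
	fmt.Println(b.Name)     // fields are promoted as well
}
```

So when completing after `b.`, the candidates have to include `Level` and `A` from B itself plus `Name` and `Work` from the embedded A, and the same applies recursively to whatever A embeds.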


The third change reports parsing errors as problems in the "Problems" view.

The fourth change fixed usage information being placed in the wrong context. Before that change, information about variable usages was placed in the top context, which broke semantic highlighting: variable declarations had different colors but usages were all the same. So, while improving the overall correctness of the generated DU-Chain, I also fixed that issue, and now variable usages are colored too (see screenshots)!
Before
After


Apart from the DU-Chain improvements, I got a basic project manager plugin merged; it offers a template for a simple console Go application and makes it easier to build Go projects.



Looking forward to next week!

25 Jun 2017 4:24pm GMT

That’s one small step for a man, one giant leap for GSoC project.

For most of the last month I was taking exams, so I did not do much for my project. Nevertheless, I implemented the basic primitives and tested them.

Let me tell you about them.

Wet map.

Water is the main part of watercolors. That's why I started with this.

The wet map contains 2 types of information: water value and speed vector. The first parameter is clear, but the second one needs explanation. The speed vector is needed for rewetting our splats (keep it in mind, I will explain what this is later).

The wet map stores all these values in a KisPaintDevice:

KisPaintDeviceSP m_wetMap;

rgb16 was chosen as the color space:

m_wetMap = new KisPaintDevice(KoColorSpaceRegistry::instance()->rgb16());

The water value and the speed vector are stored in the pixel data.

But in this form the paint device can't visualize the wet map correctly:

So I transform the paint device to visualize the wet map correctly (because it will be important for artists, I think). And now it looks like this:

Splat

My implementation is based on a procedural brush. Every brush stamp is a union of dynamic splats. Here you can see the behavior of splats:

I also tested rewetting (when a splat goes from the fixed state to the flowing state):

flowing_part_one_adding_water

And as a final test I made splat generator for simulating strokes:

generating_blue_25radius_sin_10sec

generating_red_100radius_sin_1sec

What next?

It's high time to get splats to work in Krita. So I'm going to finish my plugin, and test splats behavior. But it will be primitive:

  1. Clear canvas for updating splats
  2. No undo/redo
  3. Stupid singleton for splat storage


25 Jun 2017 3:33pm GMT

Planet GNOME

Lucie Charvat: GSoC: Show Me More

Last time we spoke, the documentation was temporarily implemented using the GtkTooltip, making it simple to show the snippet of the documentation but impossible to make it interactive. That's where the GtkPopover came in.

This widget provides options to mimic the behavior of the tool tip yet provides interface to customize the content to accommodate all the planned features. Now we can add more information from the documentation without taking up the entire screen.

The design aspect of the card is still a bit lacking since I have still not committed to how the XML file with the documentation is going to be parsed. I am weighing the fact that there is no need to analyze the entire file against the fact that more knowledge of the structure would make it easier to style the text.

The current implementation of documentation can be found here. Any feedback will be appreciated.

If the card doesn't want to show up, you might want to check the right panel to see whether you actually have the documentation. I haven't found a centralized system for all the documentation, but if you install the *-doc package for a specific library, it will be automatically added to your Builder through Devhelp, for instance:


25 Jun 2017 2:10pm GMT

Planet KDE

The achievements during the first GSoC period in KStars

I was able to make major improvements to the build system of KStars. I think more and more open-source projects should pick up these low-hanging fruits with CMake and Clang:

- CCache: Speed-up the development time and working with git branches by caching the compiled object files.
- Unity Build: This simple "hack" can reduce the build time dramatically by compiling temporary C++ meta files in which the normal C++ files are included. Builds can be 2x-3x faster for bigger projects.
- Clang Sanitizers: Use undefined behavior and address sanitizers to hunt down memory handling errors. Although the executable must be recompiled with Clang and special compiler flags, the resulting binary will run with minimal slowdown. It is not a complete replacement for Valgrind, but these sanitizers can catch most of the problems found by Valgrind during normal runtime.
- Clang Format: Format the source code with a real compiler engine.


More details are on our wiki page:
https://techbase.kde.org/Projects/Edu/KStars/C%2B%2B_developer_tools_with_KStars_on_Linux

25 Jun 2017 7:56am GMT

Planet GNOME

Julita Inca: First Round Talks of Fedora + GNOME at UPN

Today our local group traveled many miles to the north of Lima to present our latest work using Fedora and GNOME as users and developers. Thanks to the organizers of the IT Forum for inviting us and supporting our work as Linux volunteers and very promising potential contributors to GNOME and Fedora, and the group we have formed in Lima, Peru.

I started with what Linux is, who created it, why they created it, why it is important, what Fedora is, what GNOME is, how I got involved in GNOME and Fedora, how anyone can get involved, and the awesome GNOME and Fedora community.

Then Solanch Ccasa, a Systems Engineering student from UPIG, shared her experiences in the workshops she helped me with over the last year, as well as her experiences in the GNOME Peru Challenge. Great job so far Solanch! 🙂

Toto is another fantastic potential contributor to both projects. He is a Systems Engineering student from UNTELS, and I trust him with the general coordination of the GNOME Peru Challenge. Thank you Toto for all your effort!

After Toto came Leyla from UPN! She is our outstanding designer and also a Python-trained student in the GNOME Peru Challenge. She is in Systems Engineering, has organized many FLOSS events, and she is so lovely!

The hardworking Martin from UNMSM was also there supporting us! He is an experienced developer and a talented IT person; he is in physics and is also a smart and funny person. Say hi to Martin Vuelta! 😉

Alex from SENATI is another self-taught and inspired guy who is always helping us in this effort. He gave a talk on the terminal and the commands that help developers in their daily work 😀

Felipe Moreno, a Computer Science student from UNI, is another highly skilled student who explained GTK to us, along with his experiences with Go and IT-related technologies with GNOME and Fedora. Grow up in FLOSS Felipe! 🙂

Sheyla is also a student, of Mechatronics at UTP. She is a programmer and designer; she has been involved in the GNOME Peru Challenge lately and I hope she will fix a bug soon!

Last but not least, Mario Antizana showed his work with Mechatronics at UTP, and his experiences with KIA Motors and the Fedora project he proposed and won! Awesome!

The event had no fee; it was free, and we definitely prize our attendees.

Thanks to GOD first, for the support of these talented and good people in building a Linux community that pushes me to be a better leader and person for the sake of our group!

Every experience we have as a group is a new and satisfying adventure, and we enjoy it this way… despite ignorance, ridicule and opposition! Thanks again guys! More pics as follows:

Thanks again to the UPN (Private University of the North) for everything!


Filed under: FEDORA, GNOME, τεχνολογια :: Technology Tagged: community, fedora, Fedora + GNOME community, FLOSS, FLOSS community, GNOME, GNOME 3.22, GNOME Peru Challenge, GNOME Peru Challenge 2017, Julita Inca, Julita Inca Chiroque, Lima, linux, Perú, UPN

25 Jun 2017 3:53am GMT

24 Jun 2017

Planet GNOME

Yasitha Rajapaksha: GNOME Games : Progress so far

First up, I'd like to apologize for my last post on Planet GNOME. It was published on my blog months ago, where I use a 'magic' kind of theme. Well, for everyone's convenience I'll try my best to write as normally as possible from now on, even though I'm crazy.

Unfortunately, due to my prolonged end examinations, I had a late start to the coding period. However, I managed to complete the first few planned tasks successfully. The newly added (to gnome-games) desmume libretro core is working perfectly fine with Nintendo DS games. But don't get me wrong, this merely runs the game ROMs; users can't actually play the games yet. This is because the Nintendo DS has a touch screen, and the libretro core requires touch support in order for any game to be playable. Since I was lacking a touch screen for testing, adding support for touch screens was held up a bit. Then, as a start, mouse touch support was added. This means that, instead of a touch screen, a mouse can be used to play the game. Basically this was done by attaching a mouse widget to the libretro core, which itself translates the mouse input into touch screen input.

Sounded like a simple solution right? True, but does this work up to expectations? Well the following screen capture will tell you why it doesn't.

Even though the inputs are detected, it is extremely hard to control the pointer, and the mouse pointer is not going to be in the same position as the touch pointer. This makes the game practically unplayable. Therefore, it would be better to handle the mouse to touch conversion ourselves, so that the user doesn't have to deal with a secondary pointer handled by the core. For that, mouse events that come from the widget should be converted into libretro touch events. With this implemented, the mouse pointer will really represent the finger or the touch pointer.
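
As a rough illustration of what such a conversion involves (this is not the retro-gtk code, and it is written in Go purely for the sake of a compact sketch): libretro reports pointer coordinates on each axis in the range [-0x7fff, 0x7fff], independent of the screen resolution, so a mouse position in widget pixels has to be rescaled and clamped. The `TouchEvent` type and the function names are hypothetical, and the sketch assumes the widget area maps onto the whole video output of the core.

```go
package main

import (
	"fmt"
	"math"
)

// pointerMax is the edge of libretro's pointer range: each axis is
// reported in [-0x7fff, 0x7fff], whatever the screen resolution is.
const pointerMax = 0x7fff

// TouchEvent is a hypothetical container for one converted pointer sample.
type TouchEvent struct {
	X, Y    int16
	Pressed bool
}

// toPointerAxis maps a pixel position inside a widget of the given size
// onto the libretro pointer range, clamping positions outside the widget.
func toPointerAxis(pos, size float64) int16 {
	if size <= 1 {
		return 0 // degenerate widget: report the center
	}
	t := pos / (size - 1) // 0 at the left/top edge, 1 at the right/bottom edge
	t = math.Max(0, math.Min(1, t))
	return int16(math.Round((2*t - 1) * pointerMax))
}

// MouseToTouch converts a mouse sample (widget pixels plus button state)
// into a touch sample, so that the mouse cursor itself is the touch point.
func MouseToTouch(x, y, width, height float64, pressed bool) TouchEvent {
	return TouchEvent{
		X:       toPointerAxis(x, width),
		Y:       toPointerAxis(y, height),
		Pressed: pressed,
	}
}

func main() {
	// 256x192 is used here only as an example widget size.
	fmt.Println(MouseToTouch(0, 0, 256, 192, true))
	fmt.Println(MouseToTouch(255, 191, 256, 192, true))
	fmt.Println(MouseToTouch(400, -20, 256, 192, false))
}
```

Because the cursor position is mapped directly, there is no second pointer to steer: wherever the mouse is pressed is where the core sees the touch.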

However, this is a bit of a complex task. Initial steps have already been taken and some modifications to parts of retro-gtk have also been made. I'm hoping to finish this up within a couple of days. Stay tuned for the updates 🙂


24 Jun 2017 9:25pm GMT

23 Jun 2017

planet.freedesktop.org

Jente Hidskes: GSoC part 5: exams

As mentioned in my previous blog, the X.Org Foundation now wants us to blog every week. Whilst that means shorter blogs (last week's was a tad long), it also means that there isn't much to blog about if I didn't do much in a week. Such is the case for this week, sadly. It's the last week of university, and so there are a few assignment deadlines that I needed to complete; I haven't been able to invest as much time into my project as I would have wanted.

23 Jun 2017 2:52pm GMT