G::Campax - Giovanni's tech blog

Friday, 20 October 2017

An open, crowdsourced, privacy-preserving, virtual assistant (part 1)

Hello all,

I have been encouraged to blog more, and that is not hard because anything is more than nothing. So here I am, while waiting for some Mechanical Turk results, with part one of a series of posts in which I'll explain the work we've been doing in the Mobisocial Computing Laboratory here at Stanford.¹ ²

To put it quickly, we're building a Virtual Assistant called Almond³. Think Siri or Alexa.

But 1) open 2) privacy-preserving and most importantly 3) way more powerful.

An Open Assistant

Alexa has 15,000 skills. If you can think of something IoT, Alexa supports it. If you can think of some major website or web service, Alexa supports it. Amazon has over 5,000 developers working on Alexa, and there are 4 students and one professor working on Almond. How do you compete with that?

In short, you don't. Instead, you open the platform up, and let people contribute. Amazon didn't write the 15,000 skills themselves: people did, when the Echo became popular. In a similar spirit, they didn't write all the cool NLP technology: they ran the Alexa Challenge, and let the best researchers compete.
Now, we don't have $1M to offer, but being a university, even if a private one, we're in an especially good position. So we decided to open source everything we do⁴.
You can take our software, our ideas, our contributions, and make something cool with it. Our hope is that as people find it and start playing with it, they'll also contribute back, and the platform will grow.

A Privacy-Preserving Assistant

If you build an open assistant, but don't care about privacy, the first thing people will do is to take it, fork it, and make it care about privacy. It's amazing the overlap between people who understand computers, the people who care about software freedom and the people who care about privacy. So we decided to do the work ourselves.
Our Assistant is built on two major principles to make it respect privacy:

Separate what is private from what is public: we collected the public information, such as the descriptions and glue code of the services to support, the datasets and machine learning models for semantic parsing, into an open repository called Thingpedia; everything else, such as your credentials, your data, your history of commands, stays in a private system called ThingSystem.
Let you run your own Almond: your ThingSystem contains your stuff only, so we let you run on the device you choose; there is no risk of privacy leaks, because the data never actually leaves the device. To make it easier to set up, we have a version of ThingSystem that works as a ready-to-install Android application (available on the Play Store); we also have a version for desktop computers (based on GNOME technologies)⁵ and for Echo-style home assistants.

The two sides of the world, Thingpedia and ThingSystem, communicate with a third Thing*: ThingTalk, which is in my opinion the coolest part of Almond⁶. This is a formal programming language (a domain specific language, if you're into that) that exactly describes what the assistant should do. When you give a command to Almond, this command is translated into a ThingTalk program; this ThingTalk program makes use of Thingpedia, like a regular Java program would use maven or a node program would use npm. Your ThingSystem just takes this program, loads the libraries it needs, and runs it.
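To make the split a bit more concrete, here is a toy sketch in Python. It is not how Almond is actually implemented: the `THINGPEDIA` dictionary, the `Program` and `ThingSystem` classes and the function names in it are all made up for illustration. The only idea it tries to capture is the one above: the public side holds the glue code, the private side holds your credentials, and the private side loads only the pieces a given program asks for.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for Thingpedia: public glue code, keyed by "device.function".
THINGPEDIA: Dict[str, Callable[[dict], str]] = {
    "mail.send": lambda creds: f"sending mail as {creds['user']}",
    "lights.off": lambda creds: f"lights off for {creds['user']}",
}


@dataclass
class Program:
    """A toy 'program': just the names of the library functions it calls."""
    calls: List[str]


class ThingSystem:
    """Hypothetical private runtime: keeps credentials, loads only what is needed."""

    def __init__(self, credentials: dict):
        self._credentials = credentials  # stays inside this object
        self.loaded: Dict[str, Callable[[dict], str]] = {}

    def run(self, program: Program) -> List[str]:
        results = []
        for name in program.calls:
            if name not in self.loaded:
                # Fetch the public code on demand, like a package manager would.
                self.loaded[name] = THINGPEDIA[name]
            results.append(self.loaded[name](self._credentials))
        return results


system = ThingSystem({"user": "alice"})
output = system.run(Program(calls=["lights.off"]))
```

After running this, `system.loaded` contains only `lights.off`: the mail glue code was never pulled in, because the program did not need it.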

A Way More Powerful Assistant

We're PhD students in a university, so unlike Amazon, Google, or even Red Hat and Canonical, building a useful open-source assistant that people will care about is not enough. You need to build something technologically new (or as they say here, “novel”).
Our work checks that box by letting people do things that no other assistant can do. Specifically, our assistant supports end user programming, which is a fancy way of saying that the user should decide what the assistant can do, and not the programmers.
Now, an end user will never “program” in the sense of writing an app in a traditional language - it's not their job, they don't care, they want things done. Instead we interpret this to mean connecting things that already exist, in new and interesting ways. At the core, our ThingTalk programming language supports one single construct, when - get - do, which lets you specify that when something happens, you get some data and do something with it.
For example, you can generate a meme and then send it to your friends in one go; you can look at your location and turn off the lights if you're not home; you can reply to emails automatically, and even have a different reply for your boss, for your family and for scammers; you can send cute pictures to your SO automatically (I do that).
Every portion of the program is a high-level primitive, roughly corresponding to a single Alexa command, but now you can combine three of them in one sentence, meaning you can use stuff coming out from one command and feed it into the second. The result can also run on its own, so you do this rigmarole of setting up whens and gets and dos once, and then the assistant just does things for you. Additionally, you get to apply arbitrary predicates that limit when the action should run - like email filters, but for anything. Because this is a fully general system, any combination you can think of is allowed. You just say it in natural language, and Almond does it for you.
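If you prefer code to prose, the shape of when - get - do can be sketched in a few lines of Python. This is only an illustration of the idea, not ThingTalk syntax and not Almond's runtime: `run_rule`, the fake `inbox` and the three helper functions are hypothetical names I am using for the example.

```python
from typing import Callable, Iterable

def run_rule(
    when: Callable[[], Iterable[dict]],
    get: Callable[[dict], dict],
    do: Callable[[dict], None],
    predicate: Callable[[dict], bool] = lambda event: True,
) -> None:
    """Toy when - get - do loop: for each event that passes the predicate,
    compute some data from it and hand that data to the action."""
    for event in when():
        if predicate(event):
            do(get(event))


# A fake inbox stands in for a real email service (assumption for the example).
inbox = [
    {"sender": "boss@example.com", "subject": "Status?"},
    {"sender": "scam@example.net", "subject": "You won!"},
]
sent = []

def when_email_received():
    # "when": yields one event per incoming email
    yield from inbox

def get_reply(event):
    # "get": builds the data the action will use
    return {"to": event["sender"], "body": f"Re: {event['subject']} - on it!"}

def do_send(message):
    # "do": performs the action (here, just records the message)
    sent.append(message)

# Reply automatically, but only to senders at example.com.
run_rule(
    when_email_received,
    get_reply,
    do_send,
    predicate=lambda e: e["sender"].endswith("@example.com"),
)
```

The predicate plays the role of the email-filter-like restriction: the scam message produces an event, but the rule never acts on it, so `sent` ends up with a single reply to the boss.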
Now, this raises some technical challenges, starting from the fact that the natural language commands are a lot more complicated than what Alexa can understand. Alexa gets away with 15,000 skills because they force you to use one of a few magic words (“ask”, “tell”, “open”, “play”) followed by the exact name of the thing you want; after that, you only have a few choices of commands. It would be totally uncool if we required you to do that: “Almond, connect GMail to GMail when I receive an email then send an email” just sounds weird, “Almond, reply to my emails automatically” sounds a lot more natural. I'll go into more details on semantic parsing in the next blog post in this series, but suffice it to say, this is hard and nobody has a magic bullet yet.
Additionally, when - get - do on your own stuff in voice is nice, but the sentences get long and repetitive real quick, and the programming is quite limited still. Can we do better for everyday repetitive tasks? Can we be more engaging than a monotone serial voice output? Can we let you share your stuff with other people, so they can operate on it from their Almonds? Can we let you put any sort of restrictions on who can touch your stuff, where, when and what they can do with it? Turns out we can, but this is the material for the next posts in this series. Stay tuned!

PS: I changed the title of the blog because sadly I don't do as much GNOME stuff as I used to. I don't do that much C++ either, but heh.

1 This is work done in collaboration with Rakesh Ramesh, Silei Xu and Michael Fischer, under the supervision of prof. Monica Lam.
2 I should emphasize that the views expressed in this post are mine and do not reflect the view of Stanford, the Mobisocial Lab, or my colleagues.
3 Yes, like the nut. It's because we're nuts.
4 https://github.com/Stanford-Mobisocial-IoT-Lab
5 I should eventually turn around and package it as a flatpak. For now you need to build it from source. Help welcome!
6 Also my thesis. But my judgement is totally unbiased, believe me.

Saturday, 2 August 2014

Damn French, they ruined France

... but they did not ruin another great GUADEC!

As I promised at the beginning, I would do another blog post at the end, and this is it.
The first days were hectic as usual, with lots of great presentations, lightning talks and team reports on what others have been doing over the past years. Many people already blogged on this topic, and video recordings will be out soon, so I will not go further on this.
As a special exception, though, go check out Christian Hergert's talk on GNOME Builder: he is awesome for quitting his job and deciding to invest so much of his own time and money into making our lives easier with better development tools.

Following the core days, came the BOFs. I attended the Release Team meeting, where a lot of process clean ups were approved, hopefully making the requirements for being "part of GNOME" (at the various levels) clearer and more transparent.
I also went to the GTK+ meeting, but lack of sleep from the previous days together with awesome Belgian beer that turns out to be French beer (also from the days before) turned me into a zombie background figure. Stay tuned for GTK+ 3.16 though, that's where all the fun (actors^H layers, a better list model, full wayland support, and more) will be!
Finally, it's probably a good thing I did not attend the Privacy BOF, because it would have been quite embarrassing considering how poor the privacy story with GNOME Weather is: through the search provider, we would send the stored locations to the upstream services (often in clear text, and often including the current location) every time a search was performed in the overview. This is obviously unacceptable, unless the user opts in, so the search provider will be disabled by default (when #734048 lands).

Speaking of GNOME Weather, if you follow the Summer of Code projects you may know that there is an intern working on a complete redesign of the app. He personally wasn't at GUADEC unfortunately, but I had some time to sit down with the ever awesome Allan Day to work out all the details.
The code is not in master yet (it will be once I run the final tests and reviews; it's almost ready, and will surely be in 3.13.90), but I can show you a preview:

(you can see we've come a long way since the original announcement!)

This is all for now. I'd like to thank the GNOME Foundation for sponsoring me, and see you next year!

PS: the title is just a reference to a famous quote by Groundskeeper Willie from The Simpsons, and it's not meant to be in any way unfriendly to our transalpine neighbors.

Sunday, 27 July 2014

There's no Guadec like this Guadec

...And that is true every year!

This is a very short post, because Guadec has barely started (the first of the core days was yesterday), but already we had the chance to go out and party together - which is what Guadec is about, right?

More seriously, I'm really happy I had the chance to see all GNOME friends again, after last year in Brno, and I'd like to thank once again the Foundation for sponsoring me, even though I haven't been very active in the recent past.

On the technical side, I'm taking advantage of this first break to start reviewing Saurabh's patches that are implementing the wonderful new design for GNOME Weather. Looking at the Shell, I had a nice conversation with Jasper, and the outcome was that in the short term, there won't be user visible changes (except of course for Carlos's work in the app view) - no new features but stability and bug fixes.

Again, this is a short post because we just started; I will do another blog post at the end. Stay tuned!

Friday, 11 April 2014

San Francisco Hackfest!

Hi everyone!

It's been a while since I last blogged, and I should probably even stop pretending I will blog more, but anyway...
Today is the third day of the GNOME Hackfest here at the Endless Mobile offices, in the startup neighborhood of San Francisco, in the heart of Silicon Valley - where stuff happens in technology, and free software is no exception.

So, what have I been doing?
On Wednesday we started with defining the agenda for the three days, and my friends already blogged about that.
I had the chance to meet with Kristian and the unstoppable Jasper to discuss wayland's xdg_surface, state changes and resizing - and that will mean we will bring an end to all the flickering you see every day in X11 when you maximize or resize a window. Not to mention we sorted out transients (popovers and tooltips) in a world where the application does not know its absolute position.
We also all discussed application sandboxes together, as Lennart and Kay were here and they were kind enough to explain how kdbus helps in making future applications secure.

Thursday morning on the other hand was gjs. Jasper, Cosimo, Colin and I sat down and landed a lot of GC improvements that will bring more responsiveness and less memory usage to your JS applications (as well as your favorite desktop shell). We introduced background sweeping for certain large objects such as byte arrays and cairo surfaces, we fixed cairo_region and GParamSpec bindings and we made sure the GC runs often.
In the afternoon we had a presentation by our host, Endless Mobile. We were shown what they're doing, and saw a prototype of their product (a BayTrail based desktop PC with a very stylish sort of oval case). I believe what they're trying to achieve is amazing, because really the emerging markets have billions of potential customers in a place where Windows doesn't matter, and it will be great for GNOME and Free Software in general if they succeed.
After that Jim Nelson from Yorba and Daniel Foré from Elementary joined us for a long session on helping application developers, ISVs and other communities to use our platform. We talked IDEs, tooling, documentation, in preparation for the Developer Experience hackfest in Berlin. I won't be in Berlin (unfortunately it's exam week at my university), but it's been interesting to listen anyway.

Besides work, I took some time on Tuesday to visit the city of San Francisco, walking all along the bay coast, and even had a chance to visit the university of Stanford, down in Palo Alto.
And for all of that I'm having a wonderful time, and I'd like to thank the GNOME Foundation and all the contributors for sponsoring me to come here.

Thursday, 3 October 2013

Every Frame Matters

Hello fellow readers, and welcome to another installment of the least regular blog on the planet (or actually, not on the Planet anymore).

Last time, we were about to release 3.8, and this time, 3.10.0 is already out and we're working hard on 3.10.1, so today I want to talk about one of the 3.10 features, that is, Wayland done in the GNOME way.

I worked hard on it during this summer, as part of my internship at Red Hat (which I'd like to thank once again for the opportunity), and others like Phoronix and Slashdot already covered it extensively, but what changed today is that finally all the bits are in place for wider testing on Fedora 20.

Once again, I'd like to point out that this is just a tech preview, and there are many huge regressions (listed in the 3.11 feature page). Some can be fixed using jhbuild and the wip/wayland-work branch, some are just not implemented yet, and some are bugs we don't know about. So try it, complain if it crashes, but don't expect to do any real work on it, and don't assume that the final wayland experience will be the same as now.

How to try it? First, you need an up to date GNOME 3.10 (gnome-shell >= 3.9.92-3.fc20), then you need the very latest X server (xorg-x11-server-Xorg >= 1.14.3-4.fc20, currently only in testing) and intel driver (xorg-x11-drv-intel >= 2.21.15-4.fc20, from updates-testing).
Then, there are two major modes now. The first one is nested inside an existing X11 session. From a virtual terminal, run "mutter-wayland --wayland".

Alternatively, you can run a full GNOME session in a different VT. Just go <Ctrl><Alt>F2 and run "gnome-session --session=gnome-wayland".
And this is what you get:

Doesn't look very different from an X11 GNOME session? Then I did my job well :)

To leave it, just log out from the menu. If you get stuck and can't log out (which can happen for some reason, probably a timeout issue in gnome-session), run "killall gnome-session gnome-shell-wayland" in a terminal.
Note that keybindings are not in 3.10, so VT switching only works if you do "sudo chvt" from a terminal.

More details on testing GNOME can be found in the GNOME wiki.

Monday, 18 March 2013

Under the shell of the developer

So, it's 3.7.92 time! Release notes are almost out, and if everything goes according to the plan, we'll be releasing our next stable version on March 27th.

As a member of the shell team, the big news is another successful round of Every Detail Matters. Go and see for yourself the eco-friendliness of that page: almost every line is green.
Among the many bugs, I'd like to highlight one, that has probably bugged each of you since 3.0: OSDs and global keybindings (screenshots, volume, input source, brightness) work in the overview, the screen lock and when a modal dialog is up.
Many thanks to Florian Müllner for implementing it!

About the rest, suffice to say that we tried to fix all the small annoyances and inconsistencies in the shell. And the release notes already include a very nice screenshot of them, so I won't steal the surprise until we're out.

Then, on the features side, it deserves a mention that we have a new application view, with frequently used apps and custom folders. I like it!

Going back to what I did, I already blogged on the most noticeable feature I worked on this cycle, notification filtering. But the awesome GNOME folks started patching all applications in this universe, so the panel looks a lot better now:

Then, in a slow February with all exams out of the way, I started hacking on Gjs. The result is an application framework that I will propose for 3.8. You can see a demo (which doubles as a template) at https://github.com/gcampax/gtk-js-app
But I needed a real application to validate what I was writing, and so GNOME Weather was born - again. People saw there was activity, I got a bugzilla product, and bam, magically I had patches from everywhere. Now say, isn't free software the best?
But wait no more, here is Weather 3.7.92 in all its glory.

Once again, thanks to Paolo Borelli, Cosimo Cecchi and William Jon McCann for all the help and code, and thanks to all flickr artists that, by choosing a free license, contributed to the success of this app.

So, what are you waiting for? Go grab GNOME 3.7.92!
Tarballs are at the usual location, and so are jhbuild and ostree. And I'm told the build server offers pre-built VM images, if you're into that.

Saturday, 1 December 2012

Playing chase

Sometimes, you notice that writing an OS is like playing chase: sometimes you open the latest MS system, and go "Uhm... Where did I see that before?", and sometimes you proudly show the results of hours of work, and what you get is "Heh, everyone else did that ages ago!"

This is one of the latter moments. I hereby present you with the latest creation from the department of "It was about time!": GNOME Weather.

As you may guess, it is an app showing weather conditions for your location (like Win 8 does by default). It is still rough around the edges, as you may notice from the very pixelated icon, but it is there.
And I can't tell how much I love GNOME: I uploaded this four hours ago, and I already got Galician and Polish translations! People, you're simply great.

But this is not the only thing I've been working on lately. You may remember a find the difference Google+ post a while ago. For those of you who didn't care, and for those of you that still don't, what I wanted to highlight there were the finally fixed volume key handling, the headphone icon in the status bar, the modal dialogs in the overview and the new panel for configuring notifications.

Besides the global keybinding thing, which will be solved by Florian in a different way, all those things and many others are happening in GNOME 3.8.

So yeah, this was really because it was a long time since my last post, and because I felt particularly proud of what I achieved. Nothing special maybe, but I hope you enjoyed it.

GNOME 3.8 is going to be the best winter release ever!

PS: to Jasper, re the linked post above: you really made my day with your comment :)
PS2: no, it wasn't a modified background, it was the One And Only, except that no one ever sees that one in particular because it goes from midnight to 6 am...