31 Aug 2014

Planet Python

Dave Behnke: Fun with SQLAlchemy

This is a little experiment I created with SQLAlchemy. In this notebook, I'm using sqlite to create a table, and doing some operations such as deleting all the rows in the table and inserting a list of items.
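
Note: the notebook's first cell (In [1]) isn't shown here; it creates the engine, roughly along these lines (the database filename is illustrative, and echo=True is what produces the INFO log lines below):

from sqlalchemy import create_engine, MetaData, select

# echo=True makes the engine log every statement it executes
engine = create_engine('sqlite:///test.db', echo=True)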

In [2]:
# connection is a connection to the database from a pool of connections
connection = engine.connect()
# meta will be used to reflect the table later
meta = MetaData()
2014-08-10 21:10:16,410 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2014-08-10 21:10:16,411 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:16,413 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2014-08-10 21:10:16,414 INFO sqlalchemy.engine.base.Engine ()

In [3]:
# create the table if it doesn't exist already
connection.execute("create table if not exists test ( id integer primary key autoincrement, name text )")
2014-08-10 21:10:17,212 INFO sqlalchemy.engine.base.Engine create table if not exists test ( id integer primary key autoincrement, name text )
2014-08-10 21:10:17,214 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:17,217 INFO sqlalchemy.engine.base.Engine COMMIT

Out[3]:
<sqlalchemy.engine.result.ResultProxy at 0x105083e48>
In [4]:
# reflects all the tables in the current connection
meta.reflect(bind=engine)
2014-08-10 21:10:17,986 INFO sqlalchemy.engine.base.Engine SELECT name FROM  (SELECT * FROM sqlite_master UNION ALL   SELECT * FROM sqlite_temp_master) WHERE type='table' ORDER BY name
2014-08-10 21:10:17,987 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:17,989 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("sqlite_sequence")
2014-08-10 21:10:17,990 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:17,993 INFO sqlalchemy.engine.base.Engine PRAGMA foreign_key_list("sqlite_sequence")
2014-08-10 21:10:17,995 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:17,997 INFO sqlalchemy.engine.base.Engine PRAGMA index_list("sqlite_sequence")
2014-08-10 21:10:17,997 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:17,999 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("test")
2014-08-10 21:10:18,000 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:18,001 INFO sqlalchemy.engine.base.Engine PRAGMA foreign_key_list("test")
2014-08-10 21:10:18,002 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:18,004 INFO sqlalchemy.engine.base.Engine PRAGMA index_list("test")
2014-08-10 21:10:18,005 INFO sqlalchemy.engine.base.Engine ()

In [5]:
# grabs a Table object from meta
test = meta.tables['test']
test
Out[5]:
Table('test', MetaData(bind=None), Column('id', INTEGER(), table=<test>, primary_key=True, nullable=False), Column('name', TEXT(), table=<test>), schema=None)
In [6]:
# cleans out all the rows in the test table
result = connection.execute(test.delete())
print("Deleted %d row(s)" % result.rowcount)
2014-08-10 21:10:22,659 INFO sqlalchemy.engine.base.Engine DELETE FROM test
2014-08-10 21:10:22,661 INFO sqlalchemy.engine.base.Engine ()
2014-08-10 21:10:22,662 INFO sqlalchemy.engine.base.Engine COMMIT

Deleted 11 row(s)

In [7]:
# create a list of names to be inserted into the test table
names = ['alpha', 'bravo', 'charlie', 'delta', 'epsilon', 'foxtrot', 'golf', 'hotel', 'india', 'juliet', 'lima']
In [8]:
# perform multiple inserts; the list is converted on the fly into dictionaries with the name field.
result = connection.execute(test.insert(), [{'name': name} for name in names])
print("Inserted %d row(s)" % result.rowcount)
2014-08-10 21:10:26,580 INFO sqlalchemy.engine.base.Engine INSERT INTO test (name) VALUES (?)
2014-08-10 21:10:26,582 INFO sqlalchemy.engine.base.Engine (('alpha',), ('bravo',), ('charlie',), ('delta',), ('epsilon',), ('foxtrot',), ('golf',), ('hotel',)  ... displaying 10 of 11 total bound parameter sets ...  ('juliet',), ('lima',))
2014-08-10 21:10:26,583 INFO sqlalchemy.engine.base.Engine COMMIT

Inserted 11 row(s)

In [9]:
# query the rows with select; the where clause is included for demonstration
# and can be omitted
result = connection.execute(select([test]).where(test.c.id > 0)) 
2014-08-10 21:10:28,528 INFO sqlalchemy.engine.base.Engine SELECT test.id, test.name 
FROM test 
WHERE test.id > ?
2014-08-10 21:10:28,529 INFO sqlalchemy.engine.base.Engine (0,)

In [10]:
# show the results
for row in result:
    print("id=%d, name=%s" % (row['id'], row['name']))
id=56, name=alpha
id=57, name=bravo
id=58, name=charlie
id=59, name=delta
id=60, name=epsilon
id=61, name=foxtrot
id=62, name=golf
id=63, name=hotel
id=64, name=india
id=65, name=juliet
id=66, name=lima

31 Aug 2014 12:30am GMT

30 Aug 2014

Planet Python

Dave Behnke: Back to Python

After some "soul searching" and investigation of Go and Python over the last few months, I've decided to come back to Python.

My Experience

I spent a couple of months researching and developing with Go. I even bought a pre-release book (Go in Action - http://www.manning.com/ketelsen/). The concurrency chapter wasn't written quite yet, so I ended up looking elsewhere. I eventually found an excellent book explaining the concurrency concepts of Go through my Safari account (https://www.safaribooksonline.com/). The book is entitled Mastering Concurrency in Go (https://www.packtpub.com/application-development/mastering-concurrency-go).

After going through a couple of programming exercises using Go, I started to ask myself: how would I do this in Python? It started to click in my brain that on a conceptual level a goroutine is similar to an async coroutine in Python. The main difference is that Go was designed from the beginning to be concurrent; in Python it requires a little more work.
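
To illustrate the parallel, here's a minimal sketch using the asyncio module that shipped with Python 3.4 (the worker coroutine is made up):

import asyncio

# roughly the Python analogue of launching `go worker("a")` in Go
@asyncio.coroutine
def worker(name):
    yield from asyncio.sleep(1)   # cooperative, like a sleeping goroutine
    print("%s done" % name)

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait([worker("a"), worker("b")]))
loop.close()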

To make a long story short: Go is a good language, and I will probably use it for specific problems. But I'm more familiar with Python, and the passionate Python community makes me proud to be a part of it. I like having access to many interesting modules and packages: IPython, Flask, Django, SQLAlchemy, just to name a few.

I look forward to continuing to work with Python and share code examples where I can. Stay tuned!

30 Aug 2014 8:38pm GMT

Ian Ozsvald: Slides for High Performance Python tutorial at EuroSciPy2014 + Book signing!

Yesterday I taught an excerpt of my 2 day High Performance Python tutorial as a 1.5 hour hands-on lesson at EuroSciPy 2014 in Cambridge with 70 students:

We covered profiling (down to line-by-line CPU & memory usage), Cython (pure-py and OpenMP with numpy), Pythran, PyPy and Numba. This is an abridged set of slides from my 2 day tutorial, take a look at those details for the upcoming courses (including an intro to data science) we're running in October.

I also got to do a book-signing for our High Performance Python book (co-authored with Micha Gorelick), O'Reilly sent us 20 galley copies to give away. The finished printed book will be available via O'Reilly and Amazon in the next few weeks.

Book signing at EuroSciPy 2014

If you want to hear about our future courses then join our low-volume training announce list. I have a short (no-signup) survey about training needs for Pythonistas in data science, please fill that in to help me figure out what we should be teaching.

I also have a further survey on how companies are using (or not using!) data science, I'll be using the results of this when I keynote at PyConIreland in October, your input will be very useful.

Here are the slides (License: CC By NonCommercial), there's also source on github:


Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

30 Aug 2014 11:06am GMT

29 Aug 2014

Planet Python

Alec Munro: It's yer data! - how Google secured its future, and everyone else's

Dear Google,

This is a love letter and a call to action.

I believe we stand at a place where there is a unique opportunity in managing personal data.

There is a limited range of data types in the universe, and practically speaking, the vast majority of software works with a particularly tiny fraction of them.

People, for example. We know things about them.

Names, pictures of, people known, statements made, etc.

Tons of web applications conceive of these objects. Maybe not all, but probably most have some crossover. For many of the most trafficked apps, this personal data represents a very central currency. But unfortunately, up until now we've more or less been content with each app having its own currency, one that is not recognized elsewhere.

You can change that. You can establish a central, independent bank of data, owned by users and lent to applications in exchange for functionality. The format of the data itself will be defined and evolved by an independent agency of some sort.

There are two core things this will accomplish.

1) It will open up a whole new world of application development free from ties to you, Facebook, Twitter, etc.

2) It will give people back ownership of their data. They will be able to establish and evolve an online identity that carries forward as they change what applications they use.

Both of these have a dramatic impact on Google, as they allow you to do what you do best, building applications that work with large datasets, while at the same time freeing from you concerns that you are monopolizing people's data.

A new application world

When developing a new application, you start with an idea, and then you spend a lot of time defining a data model and the logic required to implement that idea on that data model. If you have any success with your application, you will need to invest further in your data model, fleshing it out, and implementing search, caching, and other optimizations.

In this new world, all you would do is include a library and point it at an existing data model. For the small fraction of data that was unique to your application, you could extend the existing model. For example:

from new_world import Model, Field

BaseUser = Model("https://new_world.org/users/1.0")

class OurUser(BaseUser):
    our_field = Field("our_field", type=String)


That's it. No persistence (though you could set args somewhere to define how to synchronize), no search, no caching. Now you can get to actually building what makes your application great.

Conceivably, you can do it all in Javascript, other than identifying the application uniquely to the data store.

And you can be guaranteed data interoperability with Facebook, Google, etc. So if you make a photo editing app, you can edit photos uploaded with any of those, and they can display the photos that are edited.

Securing our future

People have good reason to be suspicious of Google, Facebook, or any other organization that is able to derive value through the "ownership" of their data. Regardless of the intent of the organization today, history has shown that profit is a very powerful motivator for bad behaviour, and these caches of personal data represent a store of potential profit that we all expect will at some point prove too tempting to avoid abusing.

Providing explicit ownership and license of said data via a third-party won't take away the temptation to abuse the data, but will make it more difficult in a number of ways:

A gooder, more-productive, Google

By putting people's data back in their hands, and merely borrowing it from them for specific applications, the opportunities for evil are dramatically reduced.

But what I think is even more compelling for Google here is that it will make you more productive. Internally, I believe you already operate similar to how I've described here, but you constantly bump up against limitations imposed by trying not to be evil. Without having to worry about the perceptions of how you are using people's data, what could you accomplish?

Conclusion

Google wants to do no evil. Facebook is perhaps less explicit, but from what I know of its culture, I believe it aspires to be competent enough that there's no need to exploit users' data. The future will bring new leadership and changes in culture to both companies, but if they act soon, they can secure their moral aspirations and provide a great gift to the world.

(Interesting aside, Amazon's recently announced Cognito appears to be in some ways a relative of this idea, at least as a developer looking to build things. Check it out.)

29 Aug 2014 5:31pm GMT

28 Aug 2014

Planet Python

Yann Larrivée: ConFoo is looking for speakers

ConFoo is currently looking for web professionals with deep understanding of PHP, Java, Ruby, Python, DotNet, HTML5, Databases, Cloud Computing, Security and Mobile development to share their skills and experience at the next ConFoo. Submit your proposals between August 25th and September 22nd.

ConFoo is a conference for developers that has built a reputation as a prime destination for exploring new technologies, diving deeper into familiar topics, and experiencing the best of community and culture.

If you would simply prefer to attend the conference, we have a $290 discount until October 13th.

28 Aug 2014 7:49pm GMT

Martijn Faassen: Morepath 0.5(.1) and friends released!

I've just released a whole slew of things, the most important being Morepath 0.5, your friendly neighborhood Python web framework with superpowers!

What's new?

There are a bunch of new things in the documentation, in particular:

Also available is @reg.classgeneric. This depends on a new feature in the Reg library.

There are a few bug fixes as well.

For more details, see the full changelog.

Morepath mailing list

I've documented how to get in touch with the Morepath community. In particular, there's a new Morepath mailing list!

Please do get in touch!

Other releases

I've also released:

  • Reg 0.8. This is the generic function library behind some of Morepath's flexibility and power.
  • BowerStatic 0.3. This is a WSGI framework for including static resources in HTML pages automatically, using components installed with Bower.
  • more.static 0.2. This is a little library integrating BowerStatic with Morepath.

Morepath videos!

You may have noticed I linked to Morepath 0.5.1 before, not Morepath 0.5. This is because I had to put out a quick fix: I was using a new youtube extension that gave me a bit too much trouble on readthedocs. I replaced that with raw HTML, which works better. The Morepath docs now include two videos.

  • On the homepage is my talk about Morepath at EuroPython 2014 in July. It's a relatively short talk, and gives a good idea on what makes Morepath different.
  • If you're interested in the genesis and history behind Morepath, and general ideas on what it means to be a creative developer, you can find another, longer, video on the Morepath history page. This was taken last year at PyCon DE, where I had the privilege to be invited to give a keynote speech.

28 Aug 2014 5:10pm GMT

Catalin George Festila: python book from O'Reilly Media - Save 50% .

Save 50% from O'Reilly Media.
Its main goal is to help you get the best possible performance out of your Python applications.
See this book Python High Performance Programming.

28 Aug 2014 10:45am GMT

Ian Ozsvald: High Performance Python Training at EuroSciPy this afternoon

I'm training on High Performance Python this afternoon at EuroSciPy, my github source is here (as a shortlink: http://bit.ly/euroscipy2014hpc). There are prerequisites for the course.

This training is actually a tiny part of what I'll teach on my 2 day High Performance Python course in London in October (along with a Data Science course). If you're at EuroSciPy, please say Hi :-)


Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

28 Aug 2014 10:18am GMT

Richard Jones: When testing goes bad

I've recently started working on a large, mature code base (some 65,000 lines of Python code). It has 1048 unit tests implemented in the standard unittest.TestCase fashion using the mox framework for mocking support (I'm not surprised you've not heard of it).

Recently I fixed a bug which was causing a user interface panel to display when it shouldn't have been. The fix basically amounts to a couple of lines of code added to the panel in question:

+    def can_access(self, context):
+        # extend basic permission-based check with a check to see whether 
+        # the Aggregates extension is even enabled in nova 
+        if not nova.extension_supported('Aggregates', context['request']):
+            return False
+        return super(Aggregates, self).can_access(context)

When I ran the unit test suite I discovered to my horror that 498 of the 1048 tests now failed. The reason for this is that the can_access() method here is called as a side-effect of those 498 tests and the nova.extension_supported (which is a REST call under the hood) needed to be mocked correctly to support it being called.

I quickly discovered that given the size of the test suite, and the testing tools used, each of those 498 tests must be fixed by hand, one at a time (if I'm lucky, some of them can be knocked off two at a time).

The main cause is mox's mocking of callables like the one above which enforces the order that those callables are invoked. It also enforces that the calls are made at all (uncalled mocks are treated as test failures).

This means there is no possibility to provide a blanket mock for the "nova.extension_supported". Tests with existing calls to that API need careful attention to ensure the ordering is correct. Tests which don't result in the side-effect call to the above method will raise an error, so even adding a mock setup in a TestCase.setUp() doesn't work in most cases.

It doesn't help that the codebase is so large, and has been developed by so many people over years. Mocking isn't consistently implemented; even the basic structure of tests in TestCases is inconsistent.

It's worth noting that the ordering check that mox provides is never used as far as I can tell in this codebase. I haven't sighted an example of multiple calls to the same mocked API without the additional use of the mox InAnyOrder() modifier. mox does not provide a mechanism to turn the ordering check off completely.

The pretend library (my go-to for stubbing) splits out the mocking step and the verification of calls so the ordering will only be enforced if you deem it absolutely necessary.
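
A minimal sketch of that style (not from the suite in question; the stubbed signature is illustrative):

from pretend import call_recorder, call

# a recording stub: returns True for any extension and remembers each call
extension_supported = call_recorder(lambda name, request: True)

extension_supported('Aggregates', {'request': None})

# verification is a separate, explicit step -- assert only what you care
# about, with no ordering enforced unless you check for it
assert extension_supported.calls == [call('Aggregates', {'request': None})]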

The choice to use unittest-style TestCase classes makes managing fixtures much more difficult (it becomes a nightmare of classes and mixins and setUp() super() calls or alternatively a nightmare of mixing classes and multiple explicit setup calls in test bodies). This is exacerbated by the test suite in question introducing its own mock-generating decorator which will generate a mock, but again leaves the implementation of the mocking to the test cases. py.test's fixtures are a far superior mechanism for managing mocking fixtures, allowing far simpler centralisation of the mocks and overriding of them through fixture dependencies.
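
For instance, with py.test the blanket mock could live in a single fixture that any test opts into (a sketch only; 'myapp.nova' stands in for the real module path):

import pytest

@pytest.fixture
def nova_extensions(monkeypatch):
    # one central stub for the REST-backed call; a test needing different
    # behaviour just shadows this fixture
    monkeypatch.setattr('myapp.nova.extension_supported',
                        lambda name, request: True)

def test_panel_hidden(nova_extensions):
    pass  # exercise the panel here; the stub is already in place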

The result is that I spent some time working through some of the test suite and discovered that in an afternoon I could fix about 10% of the failing tests. I have decided that spending a week fixing the tests for my 5 line bug fix is just not worth it, and I've withdrawn the patch.

28 Aug 2014 8:07am GMT

27 Aug 2014

Planet Python

Mike Driscoll: wxPython: Converting wx.DateTime to Python datetime

The wxPython GUI toolkit includes its own date / time capabilities. Most of the time, you can just use Python's datetime and time modules and you'll be fine. But occasionally you'll find yourself needing to convert from wxPython's wx.DateTime objects to Python's datetime objects. You may encounter this when you use the wx.DatePickerCtrl widget.

Fortunately, wxPython's calendar module has some helper functions that can help you convert datetime objects back and forth between wxPython and Python. Let's take a look:

def _pydate2wxdate(date):
    import datetime
    assert isinstance(date, (datetime.datetime, datetime.date))
    tt = date.timetuple()
    dmy = (tt[2], tt[1]-1, tt[0])
    return wx.DateTimeFromDMY(*dmy)

def _wxdate2pydate(date):
    import datetime
    assert isinstance(date, wx.DateTime)
    if date.IsValid():
        ymd = map(int, date.FormatISODate().split('-'))
        return datetime.date(*ymd)
    else:
        return None

You can use these handy functions in your own code to help with your conversions. I would probably put these into a controller or utilities script. I would also rewrite it slightly so I wouldn't import Python's datetime module inside the functions. Here's an example:

import datetime
import wx

def pydate2wxdate(date):
    assert isinstance(date, (datetime.datetime, datetime.date))
    tt = date.timetuple()
    dmy = (tt[2], tt[1]-1, tt[0])
    return wx.DateTimeFromDMY(*dmy)

def wxdate2pydate(date):
    assert isinstance(date, wx.DateTime)
    if date.IsValid():
        ymd = map(int, date.FormatISODate().split('-'))
        return datetime.date(*ymd)
    else:
        return None
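
For instance, pulling a plain Python date out of a wx.DatePickerCtrl might look like this (a sketch; the widget attribute name is made up):

# e.g. in an event handler, where self.date_picker is a wx.DatePickerCtrl
wx_date = self.date_picker.GetValue()   # returns a wx.DateTime
py_date = wxdate2pydate(wx_date)        # plain datetime.date, or None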

You can read more about this topic on this old wxPython mailing thread. Have fun and happy coding!

27 Aug 2014 5:15pm GMT

eGenix.com: eGenix PyRun - One file Python Runtime 2.0.1 GA

Introduction

eGenix PyRun is our open source, one file, no installation version of Python, making the distribution of a Python interpreter to run Python based scripts and applications on Unix based systems as simple as copying a single file.

eGenix PyRun's executable only needs 11MB for Python 2 and 13MB for Python 3, but still supports most Python applications and scripts - and it can be compressed to just 3-4MB using upx, if needed.
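
For reference, the compression step is plain stock upx, along the lines of the following (the binary name is illustrative):

upx --best pyrun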

Compared to a regular Python installation of typically 100MB on disk, eGenix PyRun is ideal for applications and scripts that need to be distributed to several target machines, client installations or customers.

It makes "installing" Python on a Unix based system as simple as copying a single file.

eGenix has been using the product internally in the mxODBC Connect Server since 2008 with great success and decided to make it available as a stand-alone open-source product.

We provide both the source archive to build your own eGenix PyRun, as well as pre-compiled binaries for Linux, FreeBSD and Mac OS X, as 32- and 64-bit versions. The binaries can be downloaded manually, or you can let our automatic install script install-pyrun take care of the installation: ./install-pyrun dir and you're done.

Please see the product page for more details:

>>> eGenix PyRun - One file Python Runtime

News

This is a patch level release of eGenix PyRun 2.0. The major new feature in 2.0 is the added Python 3.4 support.

New Features

Enhancements / Changes

install-pyrun Quick Install Enhancements

eGenix PyRun includes a shell script called install-pyrun, which greatly simplifies installation of PyRun. It works much like the virtualenv shell script used for creating new virtual environments (except that there's nothing virtual about PyRun environments).

With the script, an eGenix PyRun installation is as simple as running:

./install-pyrun targetdir

This will automatically detect the platform, download and install the right pyrun version into targetdir.

We have updated this script since the last release:

For a complete list of changes, please see the eGenix PyRun Changelog.

Please see the eGenix PyRun 2.0.0 announcement for more details about eGenix PyRun 2.0.

Downloads

Please visit the eGenix PyRun product page for downloads, instructions on installation and documentation of the product.

More Information

For more information on eGenix PyRun, licensing and download instructions, please write to sales@egenix.com.

Enjoy !

Marc-Andre Lemburg, eGenix.com

27 Aug 2014 7:00am GMT

Anatoly Techtonik: How to make RAM disk in Linux

UPDATE (2014-08-27): Exactly three years later I discovered that Linux already comes with RAM disk enabled by default, mounted as `/dev/shm` (which points to `/run/shm` on Debian/Ubuntu):
$ df -h /dev/shm
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            75M  4.0K   75M   1% /run/shm
See detailed info here.


*RAM disk* is a term from the past, when DOS was alive and information was stored on disks instead of the internet. If you created an image of some disk, it was possible to load it into memory. Memory disks were useful for loading software from Live CDs. Usually software needs some space to write data during the boot sequence, and RAM is the fastest way to set one up.

Filesystem space in memory can be extremely useful today too. For example, to run tests without wearing down an SSD. While the idea is not new, there was no incentive to explore it until I ran across the tmpfs reference in the Ubuntu Wiki.

For example, to get 2Gb of space for files in RAM, edit /etc/fstab to add the following line:
tmpfs     /var/ramspace       tmpfs     defaults,size=2048M     0     0
/var/ramspace is now the place to store your files in memory.
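
To start using the new entry without rebooting, something like this should do it (assuming the mount point doesn't exist yet):
sudo mkdir -p /var/ramspace
sudo mount /var/ramspace
The mount command picks up the size and options from /etc/fstab.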

27 Aug 2014 12:36am GMT

26 Aug 2014

Planet Python

Mike C. Fletcher: Python-dbus needs some non-trivial examples

So I got tired of paying work this afternoon and decided I would work on getting a dbus service started for Listener. The idea here is that there will be a DBus service which does all the context management, microphone setup, playback, etc., which client software (such as the main GUI and apps that want to allow voice coding without going through low-level-grotty simulated typing) can use to interact with it.
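
The basic single-object pattern is easy enough - a trivial sketch, with the bus name and object path being nothing more than my guesses at Listener's naming:

import dbus
import dbus.service
from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GObject

class Context(dbus.service.Object):
    # one voice-coding context, exposed at its own object path
    @dbus.service.method('com.vrplumber.Listener.Context',
                         in_signature='as', out_signature='')
    def add_words(self, words):
        pass  # add words to this context's dictionary

DBusGMainLoop(set_as_default=True)
bus = dbus.SessionBus()
name = dbus.service.BusName('com.vrplumber.Listener', bus)
Context(bus, '/com/vrplumber/listener/context/default')
GObject.MainLoop().run()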

But how does one go about exposing objects on DBus in the DBus-ian way? It *seems* that object-paths should produce a REST-like hierarchy where each object I want to expose is presented at /com/vrplumber/listener/context/... but should that be done on-demand? If I have 20 contexts, should I expose them all at start-up, or should the user "request" them one at a time (get_context( key ) -> path?). Should I use a ObjectTree? How do I handle deletion/de-registration in such a way that clients are notified of the removed objects? I can hack these things in, but it would be nice to know the *right* way to do this kind of work. Should I expose functions that process directories (import this directory), or only those which process in-memory data-sets (add these words to the dictionary), can (python) DBus handle many MBs of data? What does a proper "real" DBus service look like?

So, anyone know of some good examples of python-dbus services exposing non-trivial services? Many objects, many methods, object life-cycle operations, many signals, yada, yada?

(BTW, my hacks are up on github if anyone cares to hit me with a clue-stick).

26 Aug 2014 8:55pm GMT

Ian Ozsvald: Why are technical companies not using data science?

Here's a quick question. How come more technical companies aren't making use of data science? By "technical" I mean any company with data and the smarts to spot that it has value, by "data science" I mean any technical means to exploit this data for financial gain (e.g. visualisation to guide decisions, machine learning, prediction).

I'm guessing that it comes down to an economic question - either it isn't as valuable as some other activity (making mobile apps? improving UX on the website? paid marketing? expanding sales to new territories?) or it is perceived as being valuable but cannot be exploited (maybe due to lack of skills and training or data problems).

I'm thinking about this for my upcoming keynote at PyConIreland, would you please give me some feedback in the survey below (no sign-up required)?

To be clear - this is an anonymous survey, I'll have no idea who gives the answers.

(Anonymous survey embedded here via SurveyMonkey.)

If the above is interesting, note that we've got a data science training list where we make occasional announcements about our upcoming training, and we have two upcoming training courses. We also discuss these topics at our PyDataLondon meetups. I also have a slightly longer survey (it'll take you 2 minutes, no sign-up required); I'll be discussing these results at the next PyDataLondon, so please share your thoughts.


Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

26 Aug 2014 8:35pm GMT

Fabio Zadrozny: PyDev 3.7.0, PyDev/PyCharm Debugger merge, Crowdfunding

PyDev 3.7.0 was just released.

There are some interesting things to talk about in this release...

The first is that the PyDev debugger was merged with the fork which was used in PyCharm. The final code for the debugger (and the interactive console) now lives at: https://github.com/fabioz/PyDev.Debugger. This effort was backed by IntelliJ, and from now on, work on the debugger from either front (PyDev or PyCharm) should benefit both -- pull requests are also very welcome :)

With this merge, PyDev users gain GEvent debugging and breakpoints in Django templates (note that those breakpoints can only be added through the LiClipse HTML/Django Templates editor). On the interactive console front (which was also part of this merge), the asynchronous output and console interrupt are new.

This release also changed the default UI for the PyDev editor (and for LiClipse editors too): the minimap (which got a bunch of enhancements) is now turned on by default and the scrollbars are hidden by default -- those who prefer the old behavior can change the settings in the minimap preferences to match the old style.

Also noteworthy is that code-completion on all letter chars is turned on by default (again, users who want the old behavior have to uncheck that setting in the code completion preferences page), and this release also has a bunch of bugfixes.

Now, I haven't talked about the crowdfunding for keeping up PyDev support and building a new profiler UI (https://sw-brainwy.rhcloud.com/support/pydev-2014) since it finished... well, it didn't reach its full goal. In practice that means the profiler UI will still be done and users who supported it will receive a license to use it, but it won't be open source. All in all, it wasn't that bad either: it got halfway to its target and many people seemed to like the idea. In the end, I'll only know whether keeping it going without reaching the full target was a good idea once it's commercially available (the plan is for new licenses to cover its development expenses and keep it going afterwards).

A note for profiler contributors is that I still haven't released an early-release version, but I'm working on it :)

As for PyDev, the outcome of the funding also means I have fewer resources to support it than I'd like. But given that LiClipse (http://brainwy.github.io/liclipse/) provides a share of its earnings to support PyDev, and latecomers can still contribute through http://pydev.org/, I still hope I won't need to lower its support (which would mean taking on other projects in the time I currently have for PyDev), and I think it'll still be possible to do the things outlined in the crowdfunding.




26 Aug 2014 6:22pm GMT

Ian Ozsvald: Python Training courses: Data Science and High Performance Python coming in October

I'm pleased to say that via our ModelInsight we'll be running two Python-focused training courses in October. The goal is to give you strong new research & development skills; they're aimed at folks in companies but would suit folks in academia too. UPDATE: the training courses are ready to buy (1-day Data Science, 2-day High Performance).

UPDATE: we have a <5 min anonymous survey which helps us learn your needs for Data Science training in London; please click through and answer the few questions so we know what training you need.

"Highly recommended - I attended in Aalborg in May "@ianozsvald:… upcoming Python DataSci/HighPerf training courses"" @ThomasArildsen

These and future courses will be announced on our London Python Data Science Training mailing list; sign up for occasional announcements about our upcoming courses (no spam, just occasional updates, you can unsubscribe at any time).

Intro to Data Science with Python (1 day) on Friday 24th October

Students: Basic to Intermediate Pythonistas (you can already write scripts and you have some basic matrix experience)

Goal: Solve a complete data science problem (building a working and deployable recommendation engine) by working through the entire process - using numpy and pandas, applying test driven development, visualising the problem, deploying a tiny web application that serves the results (great for when you're back with your team!)

High Performance Python (2 day) on Thursday+Friday 30th+31st October

Students: Intermediate Pythonistas (you need higher performance for your Python code)

Goal: learn techniques for high-performance computing in Python -- a mix of background theory and lots of hands-on, pragmatic exercises

The High Performance course is built on many years of teaching and talking at conferences (including PyDataLondon 2013, PyCon 2013, EuroSciPy 2012) and in companies, along with my High Performance Python book (O'Reilly). The data science course is built on techniques we've used over the last few years to help clients solve data science problems. Both courses are very pragmatic and hands-on, and will leave you with new skills that have been battle-tested by us (we use these approaches to quickly deliver correct and valuable data science solutions for our clients via ModelInsight). At PyCon 2012 my students rated me 4.64/5.0 for overall happiness with my High Performance teaching.

"@ianozsvald [..] Best tutorial of the 4 I attended was yours. Thanks for your time and preparation!" @cgoering

We'd also like to know which other courses you'd like to see; we can partner with trainers as needed to deliver new courses in London. We're focused on Python, data science, high performance and pragmatic engineering. Drop me an email (via ModelInsight) and let me know if we can help.

Do please join our London Python Data Science Training mailing list to be kept informed about upcoming training courses.


Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

26 Aug 2014 11:02am GMT

11 Oct 2013

Python Software Foundation | GSoC'11 Students

Yeswanth Swami: How I kicked off GSoC

Zero to hero

What Prompted me??

I started my third year thinking I should do something that would set me apart from the rest, and one of my professors suggested that I apply for GSoC. I don't know why, but I took the suggestion rather seriously, thanks to a bet I had with one of my friends (who is about to complete his MBBS) that whoever earned money first would buy the other a pair of "RayBan shades". Well, that was it. I was determined. I started my research early, probably at the start of February (I knew I wanted to buy my friend his shades, and buy mine too in the process).

What experiences I had before??

I started looking at previous years' GSoC projects (having had little experience with Open Source) and started learning how to contribute. I was also fascinated by the amount of knowledge one could gain just by googling and browsing web pages. I soon discovered what an immensely great tool email is: through it I could chat with anyone in the open source world, ask seemingly stupid questions, and always expect a gentle reply back with an answer. Well, that held me spellbound, and I knew I wanted to contribute to Open Source.

How did I begin??

Around the middle of March, I discovered that my passion for Python as a programming language had grown after understanding how easy it is as a language. Added to that, my popularity among my fellow classmates increased when I started evangelizing Python (thanks to my seniors for introducing it; I guess I did a decent job popularizing the language). I started contributing to the PSF (Python Software Foundation), beginning with a simple bug fixing documentation; slowly my activity on IRC increased, and I started liking a project one of the community members proposed.

A twist in the story??

There I was, still a noob, not knowing how to convince my probable mentor that I could complete the project, given direction. Around this juncture, a fellow student (from a university in France) mailed this particular mentor that he was interested in the project. Remember, I was part of the mailing list and followed its happenings. So I was furious to learn that I had competition (having put in so much effort), and I was not willing to give up my project (this being the one project I actually understood and had started researching a little; the other projects required domain knowledge I didn't have). I went back to my teachers, seniors, friends and Google and started asking the question, "how would I solve the problem the mentor posted?". I framed a couple of answers, though very noobish, but at least I could reply to the email thread, posting my understanding of the problem and how I would solve it, and asking the various questions on my mind. To my surprise, the mentor replied immediately, with comments as well as answers to the questions I posed. Then my nemesis/competitor replied back (he had good knowledge of the problem domain). I knew it was not going to be easy, so I went back again through all my sources, deepened my understanding of the problem, and posted again. I guess there were about 20 mails in the thread before we (all three of us) decided we should catch up on IRC and discuss more.

The conclusion:

Well, on IRC, most of the senior members of the community were present, and they suggested extending the scope of the project (since two students were interested in one project and showed immense passion). Unsurprisingly, over multiple meetings, the project scope was expanded, both students were given equally important but independent tasks, and both got the opportunity to say they were Google Summer of Code students. Thank goodness we decided to build the project from scratch, giving us more than enough work on our plates.

Movie titles:

1) In the open source world, there is no competition, it is only "COLLABORATION".

2) Why give up, when you can win??

3) From Zero to Hero!!

4) A prodigy in the making

p.s. I still owe my friend his shades. *sshole, I am still waiting for him to visit me so that I can buy him his shades and buy mine too. Also, I know it's been two years since this story happened, but it is never too late to share, don't you agree?


11 Oct 2013 5:39am GMT