Subscribe: Planet Python
http://www.planetpython.org/rss20.xml
Planet Python - http://planetpython.org/



Carl Chenet: Feed2tweet 1.0, tool to post RSS feeds to Twitter, released

Thu, 23 Mar 2017 23:00:26 +0000

Feed2tweet 1.0, a self-hosted Python app that automatically posts RSS feeds to the Twitter social network, was released on March 23, 2017.

The main new feature of this release lets you create filters for each RSS feed; previously you could only define global filters. Thanks to a contribution by Antoine Beaupré, Feed2tweet is also able to use syslog, starting from this release.

What’s the purpose of Feed2tweet?

Some online services offer to convert your RSS entries into Twitter posts. These services are usually unreliable, slow, and don’t respect your privacy. Feed2tweet is a self-hosted Python app: the source code is easy to read, and the official documentation is available online with lots of examples.

Twitter Out Of The Browser

Have a look at my Github account for my other Twitter automation tools:

  • Retweet: retweet all tweets (or a filtered subset) from one Twitter account to another to spread content.
  • db2twitter: get data from a SQL database (several are supported), build tweets and send them to Twitter.
  • Twitterwatch: monitor the activity of your Twitter timeline and warn you if no new tweet appears.

What about you? Do you use tools to automate the management of your Twitter account? Feel free to give me feedback in the comments below.

… and finally

You can help Feed2tweet by donating anything through Liberapay (also possible with cryptocurrencies). That’s a big motivation factor.

 




Vasudev Ram: Analysing that Python code snippet

Thu, 23 Mar 2017 21:39:55 +0000

By Vasudev Ram

Hi readers,

Some days ago I had written this post: Analyse this Python code snippet, in which I had shown a snippet of Python code (run in the Python shell), and said: "Analyse the snippet of Python code below. See what you make of it. I will discuss it in my next post."

I am a few days late in discussing it; sorry about that. Here is the analysis.

First, here's the snippet again, for reference:

    >>> a = 1
    >>> lis = [a, 2 ]
    >>> lis
    [1, 2]
    >>> lis = [a, 2 ,
    ... "abc", False ]
    >>>
    >>> lis
    [1, 2, 'abc', False]
    >>> a
    1
    >>> b = 3
    >>> lis
    [1, 2, 'abc', False]
    >>> a = b
    >>> a
    3
    >>> lis
    [1, 2, 'abc', False]
    >>> lis = [a, 2 ]
    >>> lis
    [3, 2]
    >>>

The potential for confusion (at least, as I said, for newbie Pythonistas) lies in these apparent points:

  • The variable a is set to 1.
  • Then it is put into the list lis, along with the constant 2.
  • Then lis is changed to be [a, 2, "abc", False]. One might now think that the variable a is stored in the list lis.
  • The next line prints its value, which shows it is 1. All fine so far.
  • Then b is set to 3. Then a is set to b, i.e. to the value of b. So now a is 3.
  • But when we print lis again, it still shows 1 for the first item, not 3, as some might expect (since a is now set to 3).
  • Only when we run the next line, lis = [a, 2], and then print lis again, do we see that the first item in lis is now 3.

This has to do with the concept of naming and binding in Python. When a Python statement like:

    a = 1

is run, naming and binding happens. The name on the left is first created, and then bound to the (value of the) object on the right of the equals sign (the assignment operator). The value can be any expression, which, when evaluated, results in a value (a Python object [1]) of some kind. In this case it is the int object with value 1.

[1] Almost everything in Python is an object, like almost everything in Unix is a file. [Conditions apply :)]

When that name, a, is used in an expression, Python looks up the value of the object that the name is bound to, and uses that value in the expression, in place of the name. So when the name a was used inside any of the lists that were bound to the name lis, it was actually the value bound to the name a that was used instead. So the first time it was 1, so the first item of the list became 1, and stayed as 1 until another binding of some other (list) object to the name lis was done. But by this time, the name a had been rebound to another object, the int 3, the same one that name b had been bound to just before. So the next time that the name lis was bound to a list, that list now included the value of the current object that name a was now bound to, which was 3. This is the reason why the code snippet works as it does.

On a related note (also about Python language features, syntax and semantics), I was playing around with the pprint module (Python's pretty-printer) and the Python is operator, and came up with this other snippet:

    >>> import pprint
    >>> lis = []
    >>> for i in range(10):
    ...     lis.append(lis)
    ...
    >>> print lis
    [[...], [...], [...], [...], [...], [...], [...], [...], [...], [...]]
    >>> pprint.pprint(lis)
    [<Recursion on list with id=...>,
     <Recursion on list with id=...>,
     ...
     <Recursion on list with id=...>]
    >>> len(lis)
    10
    >>> lis is lis[0]
    True
    >>> lis is lis[0] is lis[0][0]
    True
    >>> lis is lis[0] is lis[0][0] is lis[0][0][0]
    True

in which I created a list, appended it to itself, and then used pprint.pprint on it. I also used the Python is operator between the list and its 0th item, recursively, and was interested to see that the is operator ca[...]



NumFOCUS: PyData Atlanta Meetup Celebrates 1 Year

Thu, 23 Mar 2017 21:23:50 +0000

PyData Atlanta holds a meetup at MailChimp, where Jim Crozier spoke about analyzing NFL data with PySpark.

Atlanta tells a new story about data, by Rob Clewley

In late 2015, the three of us (Tony Fast, Neel Shivdasani, and myself) had been regularly nerding out about data over beers and becoming fast friends. We were eager to see Atlanta's data community shift toward being more welcoming and encouraging to beginners, self-starters, and generalists. We were about to find out that we were not alone.

We had met at local data science-related events earlier in the year and had discovered that we had lots of opinions—and weren’t afraid to advocate for them. But we also found that we listened to reason (data-driven learning!), appreciated the art in doing good science, and cared about people and the community. Open science, open data, free-and-open-source software, and creative forms of technical communication and learning were all recurring themes in our conversations. We also all agreed that Python is a great language for working with data.

Invitations were extended to like-minded friends, and the informal hangout was soon known as “Data Beers”. The consistent good buzz that Data Beers generated helped us realize an opportunity to contribute more widely to the Atlanta community. At the time, Atlanta was beginning its emergence as a new hub in the tech world and startup culture.

Some of the existing data-oriented meetups around Atlanta have a more formal business atmosphere, or are highly focused on specific tools or tech opinions. We have found that such environments can intimidate newcomers and those less formally educated in math or computer science. This inspired us to take a new perspective through an informal and eclectic approach oriented towards beginners, self-starters, and generalists. So, in January 2016, with the support of the not-for-profit organization NumFOCUS, we set up the Atlanta chapter of PyData.

The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. NumFOCUS sponsors PyData conferences and local meetups internationally. The PyData community gathers to discuss how best to apply tools using Python, R, Stan, and Julia to meet evolving challenges in data management, processing, analytics, and visualization. In all, PyData has over 28,000 members across 52 international meetups.

The Python language and the data-focused ecosystem that has grown around it have been remarkably successful in attracting an inclusive mindset centered around free and open-source software and science. Our Atlanta chapter aims to be even more neutral about specific technologies so long as the underlying spirit resonates with our mission.

The three of us, with the help of friend and colleague Lizzy Rolando, began sourcing great speakers who have a distinctive approach to using data that resonated with the local tech culture. We hosted our first meetup in early April. From the beginning, we encouraged a do-it-yourself, interactive vibe to our meetings, supporting shorter-format 30-minute presentations with 20-minute question and answer sessions.

Regardless of the technical focus, we try to bring in speakers who are applying their data-driven work to something of general interest. Our programming balances technical and more qualitative talks. Our meetings have covered a diverse range of applications, addressing computer literacy and education, human rights, neuroscience, journalism, and civics.

A crowd favorite is the inclusion of 3-4 audience-submitted lightning talks at the end of the main Q&A. The strictly five-minute talks add more energy to the mix and give a wider platform to the local community. They’re an opportunity for students to practice presentation skills, to generate conversations around projects needing collaborators, to spark discussions about new tools, or just to have fun looking at interesting data sets.

Students, career changers, and profes[...]



Reinout van Rees: Fossgis: open source for emergencies - Marco Lechner

Thu, 23 Mar 2017 14:28:00 +0000

(One of my summaries of a talk at the 2017 fossgis conference).

He works for the Bundesamt für Strahlenschutz, basically the German government agency that was started after Chernobyl to protect against and to measure radioactivity. The software system they use/build is called IMIS.

IMIS consists of three parts:

  • Measurements (automatic + mobile measurements + laboratory results).
  • Prediction system. Including documentation (managed in Plone, a python CMS system).
  • Decision support. Helps the layers of government that have to make the decisions.

They have a simple map at odlinfo.bfs.de.

The current core of the system is proprietary. They are dependent on one single firm. The system is heavily customized for their usage.

They need a new system because geographical analysis keeps getting more important and because there are new requirements coming out of the government. The current program cannot handle that.

What they want is a new system that is as simple as possible; that uses standards for geographical exchange; they don't want to be dependent on a single firm anymore. So:

  • Use open standards, so OGC. But also a specific world-wide nuclear info protocol.
  • Use existing open source software. OSGEO.
  • If we need something special, can we change/extend existing open source software?
  • If not, then it is OK to create their own software, under an open source license.

They use open source companies to help them, including training their employees and helping those employees get used to modern software development (Jenkins, Docker, etc.).

If you use an open source strategy, what do you need to do to make it fair?

  • Your own developments should also be open source!
  • You need your own test and build infrastructure. (For instance Jenkins)
  • You need to make it easy to start working with what you made: documentation, docker, buildout (!), etc.

(Personal note: I didn't expect to hear 'buildout' at this open source GIS conference. I've helped quite a bit with that particular piece of python software :-) )




PyBites: Module of the Week - ipaddress

Thu, 23 Mar 2017 10:30:00 +0000

While playing around with code for our post on generators we discovered the ipaddress module, part of the Standard Library. Such a handy little module!
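The post itself is short, so here is a quick, hedged illustration (not taken from the PyBites article) of the kind of thing ipaddress makes easy:

    import ipaddress

    # Parse a network and enumerate its usable host addresses.
    net = ipaddress.ip_network("192.168.0.0/29")
    print(net.num_addresses)                    # 8 addresses in a /29
    print([str(host) for host in net.hosts()])  # the 6 usable hosts

    # Membership tests read naturally too.
    addr = ipaddress.ip_address("192.168.0.5")
    print(addr in net)                          # True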




Reinout van Rees: Fossgis: sewer cadastre with qgis - jörg Höttges

Thu, 23 Mar 2017 10:24:00 +0000

(One of my summaries of a talk at the 2017 fossgis conference).

Together with engineering firms from the Aachen region they created qkan. Qkan is:

  • A data structure.
  • Plugins for Qgis.
  • Direct access. Not a specific application with restricted access, but unrestricted access from within Qgis. (He noticed lots of interest among the engineers to learn qgis during the project!)

It has been designed for the needs of the engineers that have to work with the data. You first import the data from the local sewer database. Qkan converts the data to what it needs. Then you can do simulations in a separate package. The results of the simulation will be visualized by Qkan in qgis. Afterwards you probably have to make some corrections to the data and give corrections back to the original database. Often you have to go look at the actual sewers to make sure the database is correct. Output is often a map with the sewer system.

Some functionality: import sewer data (in various formats). Simulate water levels. Draw graphs of the water levels in a sewer. Support database-level checks ("an end node cannot occur halfway along a sewer").

They took care to make the database schema simple. The source sewer database is always very complex because it has to hold lots of metadata. The engineer that has to work with it needs a much simpler schema in order to be productive. Qkan does this.

They used qgis, spatialite, postgis, python and qt (for forms). An important note: they used as much postgis functionality as possible instead of the geographical functions from qgis, because postgis (and even spatialite) is often much quicker.

With qgis, python and the "qt designer", you can make lots of handy forms. But you can always go back to the database that's underneath it.
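As a hedged sketch (not from the talk) of the usual QGIS/PyQt pattern for hooking up a form built in Qt Designer, with a placeholder .ui file name:

    import os
    from PyQt4 import uic
    from PyQt4.QtGui import QDialog

    # Load the widgets defined in a Qt Designer file; 'sewer_form.ui' is a placeholder.
    FORM_CLASS, _ = uic.loadUiType(
        os.path.join(os.path.dirname(__file__), 'sewer_form.ui'))

    class SewerFormDialog(QDialog, FORM_CLASS):
        def __init__(self, parent=None):
            super(SewerFormDialog, self).__init__(parent)
            self.setupUi(self)  # builds the form widgets on this dialog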

The code is at https://github.com/hoettges




CubicWeb: Introducing cubicweb-jsonschema

Thu, 23 Mar 2017 09:57:00 +0000

This is the first post of a series introducing the cubicweb-jsonschema project that is currently under development at Logilab. In this post, I'll first introduce the general goals of the project and then present in more detail two aspects: the data models (the connection between Yams and JSON Schema in particular) and the basic features of the API. This post does not always present how things work in the current implementation but rather how they should work.

Goals of cubicweb-jsonschema

From a high-level point of view, cubicweb-jsonschema addresses mainly two interconnected aspects. One is related to modelling for client-side development of user interfaces to CubicWeb applications, while the other one concerns the HTTP API.

As far as modelling is concerned, cubicweb-jsonschema essentially aims at providing a transformation mechanism between a Yams schema and JSON Schema that is both automatic and extensible. This means that we can ultimately expect Yams definitions alone to be sufficient to generate JSON schema definitions that are consistent enough to build a UI, pretty much as it currently is with the automatic web UI in CubicWeb. A corollary of this goal is that we want JSON schema definitions to match their context of usage, meaning that a JSON schema definition would not be the same in the context of viewing, editing or relationship manipulation.

In terms of API, cubicweb-jsonschema essentially aims at providing an HTTP API to manipulate entities based on their JSON Schema definitions. Finally, the ultimate goal is to expose a hypermedia API for a CubicWeb application in order to be able to ultimately build an intelligent client. For this we'll build upon the JSON Hyper-Schema specification. This aspect will be discussed in a later post.

Basic usage as an HTTP API library

Consider a simple case where one wants to manipulate entities of type Author described by the following Yams schema definition:

    class Author(EntityType):
        name = String(required=True)

With cubicweb-jsonschema one can get a JSON Schema for this entity type in different contexts, such as view, creation or edition. For instance, in a view context, the JSON Schema will be:

    {
        "$ref": "#/definitions/Author",
        "definitions": {
            "Author": {
                "additionalProperties": false,
                "properties": {
                    "name": {
                        "title": "name",
                        "type": "string"
                    }
                },
                "title": "Author",
                "type": "object"
            }
        }
    }

whereas in a creation context, it'll be:

    {
        "$ref": "#/definitions/Author",
        "definitions": {
            "Author": {
                "additionalProperties": false,
                "properties": {
                    "name": {
                        "title": "name",
                        "type": "string"
                    }
                },
                "required": [
                    "name"
                ],
                "title": "Author",
                "type": "object"
            }
        }
    }

(notice the required keyword listing the name property).

Such JSON Schema definitions are automatically generated from Yams definitions. In addition, cubicweb-jsonschema exposes some endpoints for basic CRUD operations on resources through an HTTP (JSON) API. From the client's point of view, requests on these endpoints are of course expected to match the JSON Schema definitions. Some examples:

Get an author resource:

    GET /author/855
    Accept: application/json

    HTTP/1.1 200 OK
    Content-Type: application/json

    {"name": "Ernest Hemingway"}

Update an author:

    PATCH /author/855
    Accept: application/json
    Content-Type: application/json

    {"name": "Ernest Miller Hemingway"}

    HTTP/1.1 200 OK
    Location: /author/855/
    Content-Type: application/json

    {"name": "Ernest Miller Hemingway"}

Create an author:

    POST /author
    Accept: application/json
    Content-Type: application/json

    {"name": "Victor Hugo"}

    HTTP/1.1 201 Created
    Co[...]



Reinout van Rees: Fossgis: creating maps with open street map in QGis - Axel Heinemann

Thu, 23 Mar 2017 09:55:00 +0000

(One of my summaries of a talk at the 2017 fossgis conference).

He wanted to make a map for a local run. He wanted a nice map with the route and the infrastructure (start, end, parking, etc). Instead of the usual not-quite-readable city plan with a simple line on top. With qgis and openstreetmap he should be able to make something better!

A quick try with QGis, combined with the standard openstreetmap base map, already looked quite nice, but he wanted to do more customizations on the map colors. So he needed to download the openstreetmap data. That turned into quite a challenge. He tried two plugins:

  • OSMDownloader: easy selection, quick download. Drawback: too many objects as you cannot filter. The attribute table is hard to read.
  • QuickOSM: key/value selection, quick. Drawback: you need a bit of experience with the tool, as it is easy to forget key/values.

He then landed on https://overpass-turbo.eu. The user interface is very friendly. There is a wizard to get common cases done. And you can browse the available tags.

With the data downloaded with overpass-turbo, he could easily adjust colors and get a much nicer map out of it.

You can get it to work, but it takes a lot of custom work.

Some useful links:

  • https://taginfo.openstreetmap.org
  • http://tagfinder.herokuapp.com
  • https://gis.stackexchange.com


Photo explanation: just a nice unrelated picture from the recent beautiful 'on traxs' model railway exhibition (see video).




Reinout van Rees: Fossgis: introduction on some open source software packages

Thu, 23 Mar 2017 09:55:00 +0000

(One of my summaries of a talk at the 2017 fossgis conference).

The conference started with a quick introduction to several open source programs.

Openlayers 3 - Marc Jansen

Marc works on both openlayers and GeoExt. Openlayers is a javascript library with lots and lots of features. To see what it can do, look at the 161 examples on the website :-) It works with both vector layers and raster layers.

Openlayers is a quite mature project; the first version is from 2006. It changed a lot to keep up with the state of the art. But they did take care to keep everything backwards compatible. Upgrading from 2.0 to 2.2 should have been relatively easy. The 4.0.0 version came out last month.

Openlayers:

  • Allows many different data sources and layer types.
  • Has built-in interaction and controls.
  • Is very actively developed.
  • Is well documented and has lots of examples.

The aim is to be easy to start with, but also to allow full control of your map and all sorts of customization.

Geoserver - Marc Jansen

(Again Marc: someone was sick...) Geoserver is a java-based server for geographical data. It supports lots of OGC standards (WMS, WFS, WPS, etc). Flexible, extensible, well documented. "Geoserver is a glorious example that you can write very performant software in java". Geoserver can connect to many different data sources and make those sources available as map data.

If you're a government agency, you're required to make INSPIRE metadata available for your maps: geoserver can help you with that.

A big advantage of geoserver: it has a browser-based interface for configuring it. You can do 99% of your configuration work in the browser. For maintaining it: there is monitoring to keep an eye on it. Something to look at: the importer plugin. With it you get a REST API to upload shapes, for instance. The latest version also supports LDAP groups. LDAP was already supported, but group membership not yet.

Mapproxy - Dominik Helle

Dominik is one of the MapProxy developers. Mapproxy is a WMS cache and tile cache. The original goal was to make maps quicker by caching maps. Some possible sources: WMS, WMTS, tiles (google/bing/etc), MapServer. The output can be WMS, WMS-C, WMTS, TMS, KML. So the input could be google maps and the output WMS. One of their customers combines the output of five different regional organisations into one WMS layer...

The maps that mapproxy returns can be stored on a local disk in order to improve performance. The way they store it allows mapproxy to support intermediary zoom levels instead of fixed ones. The cache can be in various formats: MBTiles, sqlite, couchdb, riak, arcgis compact cache, redis, s3. The cache is efficient by combining layers and by omitting unneeded data (empty tiles). You can pre-fill the cache ("seeding").

Some other possibilities, apart from caching:

  • A nice feature: clipping. You can limit a source map to a specific area.
  • Reprojecting from one coordinate system to another. Very handy if someone else doesn't want to support the coordinate system that you need.
  • WMS feature info: you can just pass it on to the backend system, but you can also intercept and change it.
  • Protection of your layers. Password protection. Protect specific layers. Only allow specific areas. Etcetera.

QGis - Thomas Schüttenberg

QGis is an open source gis platform. Desktop, server, browser, mobile. And it is a library. It runs on osx, linux, windows, android. The base is the QT ui library, hence the name. Qgis contains almost everything you'd expect from a GIS package. You can extend it with plugins.

Qgis is a very, very active project. Almost 1 million lines of code. 30.000+ github commits. 332 developers have worked on it, 104 of them in the last 12 months. Support via documentation, mailinglists and http://gis.stackexchange.com/. In case you're wondering about the names of the releases[...]



Tomasz Früboes: Unittesting print statements

Thu, 23 Mar 2017 09:10:58 +0000

Recently I was refactoring a small package that is supposed to allow execution of arbitrary python code on a remote machine. The first implementation was working nicely but with one serious drawback – the function handling the actual code execution was running in a synchronous (blocking) mode. As a result, all of the output (both stdout and stderr) was presented only at the end, i.e. when the code finished its execution. This was unacceptable since the package should work in a way as transparent to the user as possible. So a wall of text when the code completes its task wasn't acceptable. The goal of the refactoring was simple – to have the output presented to the user immediately after it was printed on the remote host.

As a TDD worshipper I wanted to start this in a kosher way, i.e. with a test. And I got stuck. For a day or so I had no idea how to progress. How do you unittest print statements? It's funny when I think about this now. I have used a similar technique many times in the past for output redirection, yet somehow I hadn't managed to make a connection with this problem.

The print statement

So how do you do it? First we should understand what happens when a print statement is executed. In python 2.x the print statement does two things – it converts the provided expressions into strings and writes the result to a file-like object handling stdout. Conveniently, that object is available as sys.stdout (i.e. as a part of the sys module). So all you have to do is to overwrite sys.stdout with your own object providing a 'write' method. Later you may discover that some other methods may also be needed (e.g. 'flush' is quite often used), but for starters, having only the 'write' method should be sufficient.

A first try – simple stdout interceptor

The code below does just that. The MyOutput class is designed to replace the original sys.stdout:

    import unittest
    import sys

    def fn_print(nrepeat):
        print "ab"*nrepeat

    class MyTest(unittest.TestCase):
        def test_stdout(self):
            class MyOutput(object):
                def __init__(self):
                    self.data = []

                def write(self, s):
                    self.data.append(s)

                def __str__(self):
                    return "".join(self.data)

            stdout_org = sys.stdout
            my_stdout = MyOutput()
            try:
                sys.stdout = my_stdout
                fn_print(2)
            finally:
                sys.stdout = stdout_org

            self.assertEquals(str(my_stdout), "abab\n")

    if __name__ == "__main__":
        unittest.main()

The fn_print function provides output to test against. After replacing sys.stdout we call this function and compare the obtained output with the expected one. It is worth noting that in the example above the original sys.stdout is first preserved and then carefully restored inside the 'finally' block. If you don't do this you are likely to lose any output coming from other tests.

Is my code async? Logging time of arrival

In the second example we will address the original problem – is output presented as a wall of text at the end, or in real time as we want it to be? For this we will add time-of-arrival logging capability to the object replacing sys.stdout:

    import unittest
    import time
    import sys

    def fn_print_with_delay(nrepeat):
        for i in xrange(nrepeat):
            print  # prints a single newline
            time.sleep(0.5)

    class TestServer(unittest.TestCase):
        def test_stdout_time(self):
            class TimeLoggingOutput(object):
                def __init__(self):
                    self.data = []
                    self.timestamps = []

                def write(self, s):
                    self.timestamps.append(time.time())
                    self.data.append(s)

            stdou[...]
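The post is cut off above. Purely as a hedged sketch (not the author's original ending), the time-of-arrival idea can be turned into an assertion roughly like this, checking that the newlines arrive spread out in time rather than as one burst at the end:

    import sys
    import time

    class TimeLoggingOutput(object):
        def __init__(self):
            self.data = []
            self.timestamps = []

        def write(self, s):
            self.timestamps.append(time.time())
            self.data.append(s)

    def fn_print_with_delay(nrepeat):
        for i in range(nrepeat):
            print("")        # one (empty) print, i.e. one newline, per iteration
            time.sleep(0.5)

    stdout_org = sys.stdout
    out = TimeLoggingOutput()
    try:
        sys.stdout = out
        fn_print_with_delay(3)
    finally:
        sys.stdout = stdout_org

    # Keep only the timestamps of the newline writes and check the gaps:
    # roughly 0.5 s apart means real-time output, not a wall of text at the end.
    newline_times = [t for t, s in zip(out.timestamps, out.data) if s == "\n"]
    gaps = [later - earlier for earlier, later in zip(newline_times, newline_times[1:])]
    assert all(gap >= 0.4 for gap in gaps)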



DataCamp: PySpark Cheat Sheet: Spark in Python

Thu, 23 Mar 2017 09:10:07 +0000

Apache Spark is generally known as a fast, general and open-source engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. It allows you to speed up analytic applications by up to 100 times compared to other technologies on the market today. You can interface Spark with Python through "PySpark", the Spark Python API, which exposes the Spark programming model to Python. 

Even though working with Spark will remind you in many ways of working with Pandas DataFrames, you'll also see that it can be tough getting familiar with all the functions that you can use to query, transform, inspect, ... your data. What's more, if you've never worked with any other programming language or if you're new to the field, it might be hard to distinguish between RDD operations.

Let's face it, map() and flatMap() are different enough, but it might still come as a challenge to decide which one you really need when you're faced with them in your analysis. Or what about other functions, like reduce() and reduceByKey()?
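As a quick, hedged illustration (not part of the cheat sheet itself; the local SparkContext and the sample data are assumptions made for the example), this is how the difference usually shows up:

    from pyspark import SparkContext

    sc = SparkContext("local", "map_vs_flatmap")

    lines = sc.parallelize(["hello world", "foo bar baz"])

    # map(): one output element per input element (a list of words per line).
    print(lines.map(lambda line: line.split()).collect())
    # [['hello', 'world'], ['foo', 'bar', 'baz']]

    # flatMap(): the per-element results are flattened into one RDD of words.
    words = lines.flatMap(lambda line: line.split())
    print(words.collect())
    # ['hello', 'world', 'foo', 'bar', 'baz']

    # reduce() folds the whole RDD into one value; reduceByKey() reduces per key.
    print(words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b).collect())

    sc.stop()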


Even though the documentation is very elaborate, it never hurts to have a cheat sheet by your side, especially when you're just getting into it.

This PySpark cheat sheet covers the basics, from initializing Spark and loading your data, to retrieving RDD information, sorting, filtering and sampling your data. But that's not all. You'll also see that topics such as repartitioning, iterating, merging, saving your data and stopping the SparkContext are included in the cheat sheet. 

Note that the examples in the document take small data sets to illustrate the effect of specific functions on your data. In real life data analysis, you'll be using Spark to analyze big data.

Are you hungry for more? Don't miss our other Python cheat sheets for data science, which cover topics such as Python Basics, NumPy, Pandas, Pandas Data Wrangling and much more! 




Rene Dudfield: pip is broken

Thu, 23 Mar 2017 09:00:46 +0000

Help?

Since asking people to use pip to install things, I get a lot of feedback on pip not working. Feedback like this.

"Our fun packaging Jargon"

What is a pip? What's it for? It's not built into python? It's the almost-default and almost-standard tool for installing python code. Pip almost works a lot of the time. You install things from pypi. I should download pypy? No, pee why, pee eye. The cheeseshop. You're weird. Just call it pee why pee eye. But why is it called pip? I don't know.

"Feedback like this."

pip is broken on the raspberian.

pip3 doesn't exist on windows.

People have an old pip. Old pip doesn't support wheels. What are wheels? It's a cute bit of jargon to mean a zip file with python code in it structured in a nice way. I heard about eggs... tell me about eggs? Well, eggs are another zip file with python code in it. Used mainly by easy_install. Easy install? Let's use that, this is all too much.

The pip executable or script is for python 2, and they are using python 3.

pip is for a system python, and they have another python installed. How did they install that python? Which of the several pythons did they install? Maybe if they install another python it will work this time.

It's not working one time and they think that sudo will fix things. And now certain files can't be updated without sudo. However, now they have forgotten that sudo exists.

"pip lets you run it with sudo, without warning."

pip doesn't tell them which python it is installing for. But I installed it! Yes you did. But which version of python, and into which virtualenv? Let's use these cryptic commands to try and find out...

pip doesn't install things atomically, so if there is a failed install, things break. If pip was a database (it is)...

Virtual environments work if you use python -m venv, but not virtualenv. Or sometimes it's the other way around. If you have the right packages installed on Debian, and Ubuntu... because they don't install virtualenv by default.

What do you mean I can't rename my virtualenv folder? I can't move it to another place on my Desktop?

pip installs things into global places by default.

"Globals by default."

Why are packages still installed globally by default?

"So what works currently most of the time?"

    python3 -m venv anenv
    . ./anenv/bin/activate
    pip install pip --upgrade
    pip install pygame

This is not ideal. It doesn't work on windows. It doesn't work on Ubuntu. It makes some text editors crash (because virtualenvs have so many files they get sick). It confuses test discovery (because for some reason they don't know about virtual environments still and try to test random packages you have installed). You have to know about virtualenv, about pip, about running things with modules, about environment variables, and system paths. You have to know that at the beginning. Before you know anything at all.

Is there even one set of instructions where people can have a new environment, and install something? Install something in a way that it might not break their other applications? In a way which won't cause them harm? Please let me know the magic words?

I just tell people `pip install pygame`. Even though I know it doesn't work. And can't work. By design. I tell them to do that, because it's probably the best we got. And pip keeps getting better. And one day it will be even better.

Help? Let's fix this.

[...]



Talk Python to Me: #104 Game Theory in Python

Thu, 23 Mar 2017 08:00:00 +0000

Game theory is the study of competing interests, be it individual actors within an economy or healthy vs. cancerous cells within a body.

Our guests this week, Vince Knight, Marc Harper, and Owen Campbell, are here to discuss their python project built to study and simulate one of the central problems in Game Theory: The prisoners' dilemma.
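As a hedged taste of the project discussed in the episode (a minimal sketch based on the Axelrod library's documented usage, not code from the show):

    import axelrod as axl

    # A tiny iterated prisoners' dilemma tournament between three classic strategies.
    players = [axl.Cooperator(), axl.Defector(), axl.TitForTat()]
    tournament = axl.Tournament(players, turns=10, repetitions=3)
    results = tournament.play()

    print(results.ranked_names)  # strategies ordered by how well they did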

Links from the show:

Axelrod on GitHub: github.com/Axelrod-Python/Axelrod
The docs: axelrod.readthedocs.io/en/latest
The tournament: axelrod-tournament.readthedocs.io/en/latest
Chat: Gitter room: gitter.im/Axelrod-Python
Peer reviewed paper: openresearchsoftware.metajnl.com/articles/10.5334/jors.125
Djaxelrod v2: github.com/Axelrod-Python/axelrod-api
Some examples with jupyter: github.com/Axelrod-Python/Axelrod-notebooks

Find them on Twitter
The project: @AxelrodPython
Owen on Twitter: @opcampbell
Vince on Twitter: @drvinceknight

Sponsored items
Our courses: training.talkpython.fm
Podcast's Patreon: patreon.com/mkennedy



Kushal Das: Running MicroPython on 96Boards Carbon

Thu, 23 Mar 2017 06:42:00 +0000

I received my Carbon from Seedstudio a few months back. But I never found time to sit down and work on it. During FOSSASIA, in my MicroPython workshop, Siddhesh was working to put MicroPython using Zephyr on his Carbon. That gave me the motivation to have a look at the same after coming back home.

What is Carbon?

Carbon is a 96Boards IoT edition compatible board, with a Cortex-M4 chip and 512KB flash. It currently runs Zephyr, which is a Linux Foundation hosted project to build a scalable real-time operating system (RTOS).

Setup MicroPython on Carbon

To install the dependencies in Fedora:

    $ sudo dnf group install "Development Tools"
    $ sudo dnf install git make gcc glibc-static \
        libstdc++-static python3-ply ncurses-devel \
        python-yaml python2 dfu-util

The next step is to set up the Zephyr SDK. You can download the latest binary from here. Then you can install it under your home directory (you don't have to install it system-wide). I installed it under the ~/opt/zephyr-sdk-0.9 location. Next, I had to check out the zephyr source; I cloned from the https://git.linaro.org/lite/zephyr.git repo. I also cloned MicroPython from the official GitHub repo. I will just copy paste the next steps below.

    $ source zephyr-env.sh
    $ cd ~/code/git/
    $ git clone https://github.com/micropython/micropython.git
    $ cd micropython/zephyr

Then I created a project file specially for the carbon board; this file is named prj_96b_carbon.conf, and I am pasting the content below. I have submitted the same as a patch to the upstream MicroPython project. It disables networking (otherwise you will get stuck while trying to get the REPL).

    # No networking for carbon
    CONFIG_NETWORKING=n
    CONFIG_NET_IPV4=n
    CONFIG_NET_IPV6=n

Next, we have to build MicroPython as a Zephyr application.

    $ make BOARD=96b_carbon
    $ ls outdir/96b_carbon/
    arch     ext          isr_tables.c  lib          Makefile         scripts  tests       zephyr.hex  zephyr.map           zephyr.strip
    boards   include      isr_tables.o  libzephyr.a  Makefile.export  src      zephyr.bin  zephyr.lnk  zephyr_prebuilt.elf
    drivers  isrList.bin  kernel        linker.cmd   misc             subsys   zephyr.elf  zephyr.lst  zephyr.stat

After the build is finished, you will be able to see a zephyr.bin file in the output directory.

Uploading the fresh build to the carbon

Before anything else, I connected my Carbon board to the laptop using a USB cable to the OTG port (remember to check the port name). Then I had to press the BOOT0 button and, while holding it, also press the Reset button. Then I released the Reset button first, and then the BOOT0 button. If you run the dfu-util command after this, you should be able to see some output like below.

    $ sudo dfu-util -l
    dfu-util 0.9

    Copyright 2005-2009 Weston Schmidt, Harald Welte and OpenMoko Inc.
    Copyright 2010-2016 Tormod Volden and Stefan Schmidt
    This program is Free Software and has ABSOLUTELY NO WARRANTY
    Please report bugs to http://sourceforge.net/p/dfu-util/tickets/

    Found DFU: [0483:df11] ver=2200, devnum=14, cfg=1, intf=0, path="2-2", alt=3, name="@Device Feature/0xFFFF0000/01*004 e", serial="385B38683234"
    Found DFU: [0483:df11] ver=2200, devnum=14, cfg=1, intf=0, path="2-2", alt=2, name="@OTP Memory /0x1FFF7800/01*512 e,01*016 e", serial="385B38683234"
    Found DFU: [0483:df11] ver=2200, devnum=14, cfg=1, intf=0, path="2-2", alt=1, name="@Option Bytes /0x1FFFC000/01*016 e", serial="385B38683234"
    Found DFU: [0483:df11] ver=2200, devnum=14, cfg=1, intf=0, path="2-2", alt=0, name="@Internal Flash /0x08000000/04*016Kg,01*064Kg,03*128Kg", serial="385B386832[...]



Fabio Zadrozny: PyDev 5.6.0 released: faster debugger, improved type inference for super and pytest fixtures

Thu, 23 Mar 2017 04:29:17 +0000

PyDev 5.6.0 is now available for download (and is already bundled in LiClipse 3.5.0). There are many improvements in this version!

The major one is that the PyDev.Debugger got some attention and should now be 60%-100% faster overall -- in all supported Python versions (and that's on top of the improvements done previously). This improvement was a nice example of trading memory vs speed: the major change is that the debugger now has 2 new caches, one for saving whether a frame should be skipped or not and another to save whether a given line in a traced frame should be skipped or not, which enables the debugger to make far fewer checks on those occasions.

Also, other fixes were done in the debugger. Namely:

  • the variables are now properly displayed when the interactive console is connected to a debug session;
  • it's possible to select the Qt version for which QThreads should be patched for the debugger to work with (in preferences > PyDev > Debug > Qt Threads);
  • fixed an issue where a "native Qt signal is not callable" message was raised when connecting a signal to QThread.started;
  • fixed an issue displaying variables (Ctrl+Shift+D) when debugging.

Note: from this version onward, the debugger will only support Python 2.6+ (I believe there should be very few Python 2.5 users -- Python 2.6 itself stopped being supported in 2013, so I expect this change to affect almost no one -- if someone really needs to use an older version of Python, it's always possible to get an older version of the IDE/debugger too). Also, from now on, supported versions are actually properly tested on the CI (2.6, 2.7 and 3.5 in https://travis-ci.org/fabioz/PyDev.Debugger and 2.7, 3.5 in https://ci.appveyor.com/project/fabioz/pydev-debugger).

The code-completion (Ctrl+Space) and find definition (F3) also had improvements and can now deal with the Python super construct (so it's possible to get completions and go to the definition of a method declared in a superclass when using super) and pytest fixtures (so, if you have a pytest fixture, you should now be able to get completions/go to its definition even if you don't add a docstring to the parameter saying its expected type).

Also, this release improved the support for third-party packages, so coverage, pycodestyle (previously pep8.py) and autopep8 now use the latest version available. PyLint was improved to use the same thread pool used in code-analysis, and an issue in the Django shell was fixed for django >= 1.10.

And to finish, the preferences for running unit-tests can now be saved to the project or user settings (i.e.: preferences > PyDev > PyUnit > Save to ...) and an issue was fixed when coloring the matrix multiplication operator (which was wrongly recognized as a decorator).

Thank you very much to all the PyDev supporters and Patrons (http://www.patreon.com/fabioz), who help to keep PyDev moving forward, and to JetBrains, which sponsored many of the improvements done in the PyDev.Debugger.

[...]



Matthew Rocklin: Dask Release 0.14.1

Thu, 23 Mar 2017 00:00:00 +0000

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation.

I'm pleased to announce the release of Dask version 0.14.1. This release contains a variety of performance and feature improvements. This blogpost includes some notable features and changes since the last release on February 27th.

As always you can conda install from conda-forge:

    conda install -c conda-forge dask distributed

or you can pip install from PyPI:

    pip install dask[complete] --upgrade

Arrays

Recent work in distributed computing and machine learning has motivated new performance-oriented and usability changes to how we handle arrays.

Automatic chunking and operation on NumPy arrays

Many interactions between Dask arrays and NumPy arrays work smoothly. NumPy arrays are made lazy and are appropriately chunked to match the operation and the Dask array.

    >>> x = np.ones(10)                 # a numpy array
    >>> y = da.arange(10, chunks=(5,))  # a dask array
    >>> z = x + y                       # combined become a dask.array
    >>> z
    dask.array<...>
    >>> z.compute()
    array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.])

Reshape

Reshaping distributed arrays is simple in simple cases, and can be quite complex in complex cases. Reshape now supports a much broader set of shape transformations where any dimension is collapsed or merged into other dimensions.

    >>> x = da.ones((2, 3, 4, 5, 6), chunks=(2, 2, 2, 2, 2))
    >>> x.reshape((6, 2, 2, 30, 1))
    dask.array<...>

This operation ends up being quite useful in a number of distributed array cases.

Optimize Slicing to Minimize Communication

Dask.array slicing optimizations are now careful to produce graphs that avoid situations that could cause excess inter-worker communication. The details of how they do this are a bit out of scope for a short blogpost, but the history here is interesting.

Historically dask.arrays were used almost exclusively by researchers with large on-disk arrays stored as HDF5 or NetCDF files. These users primarily used the single machine multi-threaded scheduler. We heavily tailored Dask array optimizations to this situation and made that community pretty happy. Now, as some of that community switches to cluster computing on larger datasets, the optimization goals shift a bit. We have tons of distributed disk bandwidth but really want to avoid communicating large results between workers. Supporting both use cases is possible and I think that we've achieved that in this release so far, but it's starting to require increasing levels of care.

Micro-optimizations

With distributed computing also come larger graphs and a growing importance of graph-creation overhead. This has been optimized somewhat in this release. We expect this to be a focus going forward.

DataFrames

Set_index

Set_index is smarter in two ways:

  • If you set_index on a column that happens to be sorted then we'll identify that and avoid a costly shuffle. This was always possible with the sorted= keyword but users rarely used this feature. Now this is automatic.
  • Similarly, when setting the index we can look at the size of the data and determine if there are too many or too few partitions and rechunk the data while shuffling. This can significantly improve performance if there are too many partitions (a common case).

dask/dask #2025
dask/dask #2091

Shuffle performance

We've micro-optimized some parts of datafra[...]
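Not from the post itself, but a hedged sketch of the set_index behaviour described above (the column names and sizes are made up for illustration):

    import pandas as pd
    import dask.dataframe as dd

    # Hypothetical time-indexed data; the 'ts' column is already sorted.
    pdf = pd.DataFrame({"ts": pd.date_range("2017-01-01", periods=1000, freq="T"),
                        "value": range(1000)})
    ddf = dd.from_pandas(pdf, npartitions=10)

    # Per the release notes, a sorted column is now detected automatically and the
    # costly shuffle is skipped; previously you had to pass sorted=True yourself.
    ddf2 = ddf.set_index("ts")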



Tarek Ziade: Load Testing at Mozilla

Wed, 22 Mar 2017 23:00:00 +0000

After a stabilization phase, I am happy to announce that Molotov 1.0 has been released! (Logo by Juan Pablo Bravo)

This release is an excellent opportunity to explain a little bit how we do load testing at Mozilla, and what we're planning to do in 2017 to improve the process. I am talking here specifically about load testing our HTTP services, and when this blog post mentions what Mozilla is doing there, it refers mainly to the Mozilla QA team, helped by the Services developers team that works on some of our web services.

What's Molotov?

Molotov is a simple load testing tool: a minimalist tool you can use to load test an HTTP API using Python. Molotov leverages Python 3.5+ asyncio and uses aiohttp to send HTTP requests. Writing load tests with Molotov is done by decorating asynchronous Python functions with the @scenario decorator:

    from molotov import scenario

    @scenario(100)
    async def my_test(session):
        async with session.get('http://localhost:8080') as resp:
            assert resp.status == 200

When this script is executed with the molotov command, the my_test function is going to be repeatedly called to perform the load test. Molotov tries to be as transparent as possible and just hands over session objects from the aiohttp.client module. The full documentation is here: http://molotov.readthedocs.io

Using Molotov is the first step to load test our services. From our laptops, we can run that script and hammer a service to make sure it can hold some minimal charge.

What Molotov is not

Molotov is not a fully-featured load testing solution. Load testing applications usually come with high-level features to understand how the tested app is performing. Things like performance metrics are displayed when you run a test, like what Apache Bench does by displaying how many requests it was able to perform and their average response time.

But when you are testing web services stacks, the metrics you are going to collect from each client attacking your service will include a lot of variation because of the network and the clients' CPU overhead. In other words, you cannot guarantee reproducibility from one test to the other to track precisely how your app evolves over time. Adding metrics directly in the tested application itself is much more reliable, and that's what we're doing these days at Mozilla. That's also why I have not included any client-side metrics in Molotov, besides a very simple StatsD integration. When we run Molotov at Mozilla, we mostly watch our centralized metrics dashboards and see how the tested app behaves regarding CPU, RAM, requests per second, etc.

Of course, running a load test from a laptop is less than ideal. We want to avoid the hassle of asking people to install Molotov & all the dependencies a test requires every time they want to load test a deployment -- and run something from their desktop. Doing load tests occasionally from your laptop is fine, but it's not a sustainable process. And even though a single laptop can generate a lot of load (in one project, we're generating around 30k requests per second from one laptop, and happily killing the service), we also want to do some distributed load. We want to run Molotov from the cloud. And that's what we do, thanks to Docker and Loads.

Molotov & Docker

Since running the Molotov command mostly consists of using the right command-line options and passing a test script, we've added to Molotov a second command-line utility called moloslave. 
Moloslave takes the URL of a git repository and will clone it and run the molotov test that's in it by reading a configuration file. The configura[...]



DataCamp: SciPy Cheat Sheet: Linear Algebra in Python

Wed, 22 Mar 2017 18:36:45 +0000

By now, you will have already learned that NumPy, one of the fundamental packages for scientific computing, forms at least part of the foundation of other important packages that you might use for data manipulation and machine learning with Python. One of those packages is SciPy, another one of the core packages for scientific computing in Python that provides mathematical algorithms and convenience functions built on the NumPy extension of Python.

You might now wonder why this library might come in handy for data science. Well, SciPy has many modules that will help you to understand some of the basic components that you need to master when you're learning data science, namely, math, stats and machine learning. You can find out what other things you need to tackle to learn data science here. You'll see that for statistics, for example, a module like scipy.stats will definitely be of interest to you. The other topic that was mentioned was machine learning: here, the scipy.linalg and scipy.sparse modules will offer everything that you're looking for to understand machine learning concepts such as eigenvalues, regression, and matrix multiplication. But what is maybe most obvious is that most machine learning techniques deal with high-dimensional data, and that data is often represented as matrices. What's more, you'll need to understand how to manipulate these matrices.

That is why DataCamp has made a SciPy cheat sheet that will help you to master linear algebra with Python. Take a look by clicking on the button below:

You'll see that this SciPy cheat sheet covers the basics of linear algebra that you need to get started: it provides a brief explanation of what the library has to offer and how you can use it to interact with NumPy, and goes on to summarize topics in linear algebra, such as matrix creation, matrix functions, basic routines that you can perform with matrices, and matrix decompositions from scipy.linalg. Sparse matrices are also included, with their own routines, functions, and decompositions from the scipy.sparse module.

(Above is the printable version of this cheat sheet)

Python for Data-Science Cheat Sheet: SciPy - Linear Algebra

SciPy

The SciPy library is one of the core packages for scientific computing that provides mathematical algorithms and convenience functions built on the NumPy extension of Python.

Asking For Help

    >>> help(scipy.linalg.diagsvd)

Interacting With NumPy

    >>> import numpy as np
    >>> a = np.array([1,2,3])
    >>> b = np.array([(1+5j,2j,3j), (4j,5j,6j)])
    >>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]])

Index Tricks

    >>> np.mgrid[0:5,0:5]          Create a dense meshgrid
    >>> np.ogrid[0:2,0:2]
    >>> np.r_[3,[0]*5,-1:1:10j]    Stack arrays vertically (row-wise)
    >>> np.c_[b,c]

Shape Manipulation

    >>> np.transpose(b)            Permute array dimensions
    >>> b.flatten()                Flatten the array
    >>> np.hstack((b,c))           Stack arrays horizontally (column-wise)
    >>> np.vstack((a,b))           Stack arrays vertically (row-wise)
    >>> np.hsplit(c,2)             Split the array horizontally at the 2nd index
    >>> np.vsplit(d,2)

Polynomials

    >>> from numpy imp[...]



Python Engineering at Microsoft: Interactive Windows in VS 2017

Wed, 22 Mar 2017 17:00:36 +0000

Last week we announced that the Python development workload is available now in Visual Studio Preview, and briefly covered some of the new improvements in Visual Studio 2017. In this post, we are going to go into more depth on the improvements for the Python Interactive Window. These are currently available in Visual Studio Preview, and will become available in one of the next updates to the stable release. Over the lifetime of Visual Studio 2017 we will have opportunities to further improve these features, so please provide feedback and suggestions at our GitHub site.

Interactive Windows

People who have been using Visual Studio with many versions of Python installed will be used to seeing a long list of interactive windows – one for each version of Python. Selecting any one of these would let you run short snippets of code with that version and see the results immediately. However, because we only allowed one window for each, there was no way to open multiple windows for the same Python version and try different things.

In Visual Studio 2017, the main menu has been simplified to only include a single entry. Selecting this entry (or using the Alt+I keyboard shortcut) will open an interactive window with some new toolbar items.

At the right hand side, you'll see the new "Environment" dropdown. With this field, you can select any version of Python you have installed and the interactive window will switch to it. This will reset your current state (after prompting), but will keep your history and previous output.

The button at the left hand side of the toolbar creates a new interactive window. Each window is independent from each other: they do not share any state, history or output, can use the same or different versions of Python, and may be arranged however you like. This flexibility will allow you to try two different pieces of code in the same version of Python, viewing the results side-by-side, without having to repeatedly reset everything.

Code Cells

One workflow that we see people using very successfully is what we internally call the scratchpad. In this approach, you have a Python script that contains many little code snippets that you can copy-paste from the editor into an interactive window. Typically you don't run the script in its entirety, as the code snippets may be completely unrelated. For example, you might have a "scratchpad.py" file with a range of your favorite matplotlib or Bokeh plot commands.

Previously, we provided a command to send selected text to an interactive window (press Ctrl+E twice) to easily copy code from an editor window. This command still exists, but has been enhanced in Visual Studio 2017 in the following ways.

We've added Ctrl+Enter as a new keyboard shortcut for this command, which will help more people use muscle memory that they may have developed in other tools. So if you are comfortable with pressing Ctrl+E twice, you can keep doing that, or you can switch to the more convenient Ctrl+Enter shortcut.

We have also made the shortcut work when you don't have any code selected. (This is the complicated bit, but we've found that it makes the most sense when you start using it.) In the normal case, pressing Ctrl+Enter without a selection will send the current line of text and move the caret to the following line. We do not try and figure out whether it is a complete statement or not, so you might send an invalid statement, though in this case we won't try and execute it straight away. As you send more lines of code (by pressing Ctrl+En[...]



NumFOCUS: nteract: Building on top of Jupyter (from a rich REPL toolkit to interactive notebooks)

Wed, 22 Mar 2017 16:29:01 +0000

This post originally appeared on the nteract blog.

Blueprint for nteract

nteract builds upon the very successful foundations of Jupyter. I think of Jupyter as a brilliantly rich REPL toolkit. A typical REPL (Read-Eval-Print-Loop) is an interpreter that takes input from the user and prints results (on stdout and stderr). Here's the standard Python interpreter; a REPL many of us know and love.

Standard Python interpreter

The standard terminal's spartan user interface, while useful, leaves something to be desired. IPython was created in 2001 to refine the interpreter, primarily by extending display hooks in Python. Iterative improvement on the interpreter was a big boon for interactive computing experiences, especially in the sciences.

IPython terminal

As the team behind IPython evolved, so did their ambitions to create richer consoles and notebooks. Core to this was crafting the building blocks of the protocol that were established on top of ZeroMQ, leading to the creation of the IPython notebook. It decoupled the REPL from a closed loop in one system to multiple components communicating together.

IP[y]thon Notebook

As IPython came to embrace more than just Python (R, Julia, Node, Scala, …), the IPython leads created a home for the language-agnostic parts: Jupyter.

Jupyter Notebook Classic Edition

Jupyter isn't just a notebook or a console. It's an establishment of well-defined protocols and formats. It's a community of people who come together to build interactive computing experiences. We share our knowledge across the sciences, academia, and industry — there's a lot of overlap in vision, goals, and priorities.

That being said, one project alone may not meet everyone's specific needs and workflows. Luckily, with strong support from Jupyter's solid foundation of protocols to communicate with the interpreters (Jupyter kernels) and document formats (e.g. .ipynb), you too can build your ideal interactive computing environment. In pursuit of this, members of the Jupyter community created nteract, a Jupyter notebook desktop application as well as an ecosystem of JavaScript packages to support it and more.

What is the platform that Jupyter provides to build rich interactive experiences? To explore this, I will describe the Jupyter protocol with a lightweight (non-compliant) version of the protocol that hopefully helps explain how this works under the hood.

Also a lightweight Hello World

When a user runs this code, a message is formed. We send that message and receive replies as JSON. We've received two types of messages so far:

  • execution status for the interpreter — busy or idle
  • a "stream" of stdout

The status tells us the interpreter is ready for more, and the stream data is shown below the editor in the output area of a notebook.

What happens when a longer computation runs?

Sleepy time printing

As multiple outputs come in, they get appended to the display area below the code editor.

How are tables, plots, and other rich media shown?

Yay for DataFrames! Let's send that code over to see.

The power and simplicity of the protocol emerges when using the execute_result and display_data message types. They both have a data field with multiple media types for the frontend to choose how to represent. Pandas provides te[...]
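The JSON messages in the original post are shown as images and do not survive in this text-only extract. As a hedged sketch only, in the spirit of the post's "lightweight (non-compliant)" description, an execute request and the two kinds of replies mentioned above might look roughly like this (the field names follow the Jupyter messaging spec, but these are illustrative payloads, not the post's own):

    # Illustrative only: a trimmed-down picture of Jupyter-style messages.
    execute_request = {
        "header": {"msg_id": "1", "msg_type": "execute_request"},
        "content": {"code": "print('Hello World')"},
    }

    status_busy = {
        "header": {"msg_type": "status"},
        "content": {"execution_state": "busy"},
    }

    stream_stdout = {
        "header": {"msg_type": "stream"},
        "content": {"name": "stdout", "text": "Hello World\n"},
    }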



Dataquest: Turbocharge Your Data Acquisition using the data.world Python Library

Wed, 22 Mar 2017 15:00:00 +0000

When working with data, a key part of your workflow is finding and importing data sets. Quickly locating data, understanding it, and combining it with other sources can be difficult.

One tool to help with this is data.world, where you can search for, copy, analyze, and download data sets. In addition, you can upload your data to data.world and use it to collaborate with others.

(image)

In this tutorial, we’re going to show you how to use data.world’s Python library to easily work with data from your Python scripts or Jupyter notebooks. You’ll need to create a free data.world account to view the data set and follow along.

The data.world Python library allows you to bring data that’s stored in a data.world data set straight into your workflow, without having to first download the data locally and transform it into a format you require.
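As a rough sketch of what that looks like, the snippet below assumes the datadotworld package (pip install datadotworld), its load_dataset helper, and its dataframes mapping; the data set slug is a hypothetical placeholder, and the library expects your data.world API token to be configured first.

import datadotworld as dw

# Load a data set by its "owner/dataset-id" slug (placeholder shown here).
# The library downloads and caches the data set locally on first use.
dataset = dw.load_dataset('your-username/your-dataset')

# Tabular files are exposed as pandas DataFrames keyed by resource name.
for name, frame in dataset.dataframes.items():
    print(name, frame.shape)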

Because data sets in data.world are stored in the format that the user originally uploaded them in, you often find great data sets that exist in a less-than-ideal format, such as multiple sheets of an Excel workbook, where...




DataCamp: New Python Course: Network Analysis

Wed, 22 Mar 2017 13:14:49 +0000

Hi Pythonistas! Today we're launching Network Analysis in Python by Eric Ma!

From online social networks such as Facebook and Twitter to transportation networks such as bike sharing systems, networks are everywhere, and knowing how to analyze this type of data will open up a new world of possibilities for you as a Data Scientist. This course will equip you with the skills to analyze, visualize, and make sense of networks. You'll apply the concepts you learn to real-world network data using the powerful NetworkX library. With the knowledge gained in this course, you'll develop your network thinking skills and be able to start looking at your data with a fresh perspective!

Start for free

Python: Network Analysis features interactive exercises that combine high-quality video, in-browser coding, and gamification for an engaging learning experience that will make you a master of network analysis in Python!

(image)

What you'll learn:

In the first chapter, you'll be introduced to fundamental concepts in network analytics while becoming acquainted with a real-world Twitter network dataset that you will explore throughout the course. In addition, you'll learn about NetworkX, a library that allows you to manipulate, analyze, and model graph data. You'll learn about different types of graphs as well as how to rationally visualize them. Start first chapter for free.

In chapter 2, you'll learn about ways of identifying nodes that are important in a network. In doing so, you'll be introduced to more advanced concepts in network analysis as well as learn the basics of path-finding algorithms. The chapter concludes with a deep dive into the Twitter network dataset which will reinforce the concepts you've learned, such as degree centrality and betweenness centrality.
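For a flavor of what those measures look like in NetworkX (a minimal sketch on a small random graph, not course material), degree and betweenness centrality are essentially one-liners:

import networkx as nx

# A small random graph standing in for a social network.
G = nx.erdos_renyi_graph(n=50, p=0.1, seed=42)

# Degree centrality: the fraction of other nodes each node is connected to.
deg = nx.degree_centrality(G)

# Betweenness centrality: how often a node lies on shortest paths.
btw = nx.betweenness_centrality(G)

# The five most central nodes by each measure.
print(sorted(deg, key=deg.get, reverse=True)[:5])
print(sorted(btw, key=btw.get, reverse=True)[:5])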

Chapter 3 is all about finding interesting structures within network data. You'll learn about essential concepts such as cliques, communities, and subgraphs, which will leverage all of the skills you acquired in Chapter 2. By the end of this chapter, you'll be ready to apply the concepts you've learned to a real-world case study.
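Continuing the sketch above, finding maximal cliques and pulling out a subgraph are just as direct in NetworkX (again an illustration on a random graph, not the course's dataset):

import networkx as nx

G = nx.erdos_renyi_graph(n=50, p=0.1, seed=42)

# Maximal cliques: groups of nodes that are all connected to each other.
cliques = list(nx.find_cliques(G))
largest = max(cliques, key=len)
print("largest clique:", largest)

# A subgraph containing only the nodes of that clique.
sub = G.subgraph(largest)
print(sub.number_of_nodes(), sub.number_of_edges())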

In the final chapter of the course, you'll consolidate everything you've learned by diving into an in-depth case study of GitHub collaborator network data. This is a great example of real-world social network data, and your newly acquired skills will be fully tested. By the end of this chapter, you'll have developed your very own recommendation system which suggests GitHub users who should collaborate together. Enjoy!

Start course for free




PyBites: Best Practices for Compatible Python 2 and 3 Code

Wed, 22 Mar 2017 11:42:00 +0000

95% of the most popular Python packages support Python 3. Maybe you are lucky and get to start fresh using Python 3. However, as of last year Python 2.7 still reigns supreme in pip installs, and at a lot of places 2.x is the only version you get to work in. I think writing Python 2 and 3 compatible code is an important skill, so let's check what it entails.
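As a taste of what such code looks like, here is a minimal sketch of a few common idioms (my own illustration, not necessarily the article's examples); it runs unchanged on Python 2.7 and Python 3.

# -*- coding: utf-8 -*-
# Opt in to Python 3 behavior on Python 2.
from __future__ import absolute_import, division, print_function, unicode_literals

# print() is a function and / is true division under both interpreters now.
print("half:", 1 / 2)

# Guard imports whose module names changed between versions.
try:
    from urllib.parse import urlparse   # Python 3
except ImportError:
    from urlparse import urlparse       # Python 2

print(urlparse("https://pybit.es/").netloc)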




PyCharm: Why Postgres Should be your Document Database Webinar Recording

Wed, 22 Mar 2017 11:19:36 +0000

This Monday Jim Fulton, one of the first Python contributors, hosted a webinar about storing JSONB documents in PostgreSQL. Watch it now:

Known mostly for its mature SQL and data-at-scale infrastructure, the PostgreSQL project added a “JSONB” column type in its 9.4 release, then refined it over the next two releases. While using it is straightforward, combining it with the database’s other facilities in hybrid structured/unstructured applications can require skill.

In this webinar, Python and database consultant Jim Fulton shows us how to use JSONB and related machinery for pure and hybrid Python document-oriented applications. We also briefly discuss his long history back to the start of Python, and finish with his unique NewtDB library for native Python objects coupled to JSONB queries.
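For a taste of the kind of thing covered, here is a minimal sketch (my own, not taken from the webinar) of storing and querying JSONB from Python with psycopg2; the connection settings and table are placeholders.

import psycopg2
from psycopg2.extras import Json

# Placeholder connection settings; adjust for your own server.
conn = psycopg2.connect(dbname="demo", user="demo", password="secret", host="localhost")
cur = conn.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS docs (id serial PRIMARY KEY, doc jsonb)")

# Json() adapts a Python dict to a JSON(B) parameter.
cur.execute("INSERT INTO docs (doc) VALUES (%s)",
            [Json({"name": "widget", "tags": ["blue", "sale"], "price": 9.99})])

# ->> extracts a field as text; @> tests JSONB containment.
cur.execute("SELECT doc ->> 'name' FROM docs WHERE doc @> %s",
            [Json({"tags": ["sale"]})])
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()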

Jim uses PyCharm Professional during the webinar. PyCharm Professional bundles the database tools from JetBrains DataGrip, our database IDE. However, the webinar itself is focused on the concepts of JSONB.

You can find Jim’s code on GitHub: https://github.com/jimfulton/pycharm-170320

If you have any questions or comments about the webinar, feel free to leave them in the comments below, or you can reach us on Twitter. Jim is on Twitter as well, his Twitter handle is @j1mfulton.

-PyCharm Team
The Drive to Develop

(image)



Simple is Better Than Complex: Ask Vitor #2: How to dynamically filter ModelChoice's queryset in a ModelForm?

Wed, 22 Mar 2017 11:08:00 +0000

Michał Strumecki asks: I just want to filter a select field in a form based on the currently logged-in user. Every user has their own categories and budgets. I want to display only the models related to the currently logged-in user. I’ve tried filtering before the is_valid() call, but with no result.

Answer

This is a very common use case when dealing with ModelForms. The problem is that the form fields ModelChoiceField and ModelMultipleChoiceField, which are used respectively for the model fields ForeignKey and ManyToManyField, default their queryset to Model.objects.all().

If the filtering were static, you could simply pass a filtered queryset in the form definition, like Model.objects.filter(status='pending'). When the filtering parameter is dynamic, we need to do a few tweaks in the form to get the right queryset.

Let’s simplify the scenario a little bit. We have the Django User model, a Category model and a Product model. Now let’s say it’s a multi-user application, where each user can only see the products they create, and naturally only use the categories they own.

models.py

from django.contrib.auth.models import User
from django.db import models


class Category(models.Model):
    name = models.CharField(max_length=30)
    user = models.ForeignKey(User, on_delete=models.CASCADE)


class Product(models.Model):
    name = models.CharField(max_length=30)
    price = models.DecimalField(decimal_places=2, max_digits=10)
    category = models.ForeignKey(Category)
    user = models.ForeignKey(User, on_delete=models.CASCADE)

Here is how we can create a ModelForm for the Product model that only offers the categories of the currently logged-in user:

forms.py

from django import forms

from .models import Category, Product


class ProductForm(forms.ModelForm):
    class Meta:
        model = Product
        fields = ('name', 'price', 'category', )

    def __init__(self, user, *args, **kwargs):
        super(ProductForm, self).__init__(*args, **kwargs)
        self.fields['category'].queryset = Category.objects.filter(user=user)

That means the ProductForm now has a mandatory parameter in its constructor. So, instead of initializing the form as form = ProductForm(), you need to pass a user instance: form = ProductForm(user).

Here is a working example of a view handling this form:

views.py

from django.contrib.auth.decorators import login_required
from django.shortcuts import render, redirect

from .forms import ProductForm


@login_required
def new_product(request):
    if request.method == 'POST':
        form = ProductForm(request.user, request.POST)
        if form.is_valid():
            product = form.save(commit=False)
            product.user = request.user
            product.save()
            return redirect('products_list')
    else:
        form = ProductForm(request.user)
    return render(request, 'products/product_form.html', {'form': form})

Using ModelFormSet

The machinery behind modelformset_factory is not very flexible, so we can’t add extra parameters in the form constructor. But we certainly can play with the available resources. The difference here is that we will need to change the queryset on the fly. Here is what we can do:

views.py

from django.forms import modelformset_factory


@login_required
def edit_all_products(request):
    ProductFormSet = modelformset_factory(Product, fields=('name', 'price', 'category'), extra=0)
    data = request.POST or None
    formset = ProductFormSet(data=data, queryset=Product.objects.filter(user=request.user))
    for form in f[...]