Planet Python
http://planetpython.org/

Mike Driscoll: Python 3 – Unpacking Generalizations

Tue, 21 Feb 2017 18:15:13 +0000

Python 3.5 added more support for Unpacking Generalizations in PEP 448. According to the PEP, it added extended usages of the * iterable unpacking operator and ** dictionary unpacking operator to allow unpacking in more positions, an arbitrary number of times, and in additional circumstances. What this means is that we can now make calls to functions with an arbitrary number of unpackings. Let's take a look at a dict() example:

>>> my_dict = {'1':'one', '2':'two'}
>>> dict(**my_dict, w=6)
{'1': 'one', '2': 'two', 'w': 6}
>>> dict(**my_dict, w='three', **{'4':'four'})
{'1': 'one', '2': 'two', 'w': 'three', '4': 'four'}

Interestingly, if the keys are something other than strings, the unpacking doesn't work:

>>> my_dict = {1:'one', 2:'two'}
>>> dict(**my_dict)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    dict(**my_dict)
TypeError: keyword arguments must be strings

Update: One of my readers was quick to point out that the reason this doesn't work is that I was trying to unpack into a function call (i.e. dict()). If I had done the unpacking using just dict syntax, the integer keys would have worked fine. Here's what I'm talking about:

>>> {**{1: 'one', 2:'two'}, 3:'three'}
{1: 'one', 2: 'two', 3: 'three'}

One other interesting wrinkle to dict unpacking is that later values will always override earlier ones. There's a good example in the PEP that demonstrates this:

>>> {**{'x': 2}, 'x': 1}
{'x': 1}

I thought that was pretty neat. You can do the same sort of thing with ChainMap from the collections module, but this is quite a bit simpler. However, this new unpacking also works for tuples and lists. Let's try combining some items of different types into one list:

>>> my_tuple = (11, 12, 45)
>>> my_list = ['something', 'or', 'other']
>>> my_range = range(5)
>>> combo = [*my_tuple, *my_list, *my_range]
>>> combo
[11, 12, 45, 'something', 'or', 'other', 0, 1, 2, 3, 4]

Before this unpacking change, you would have had to do something like this:

>>> combo = list(my_tuple) + list(my_list) + list(my_range)
>>> combo
[11, 12, 45, 'something', 'or', 'other', 0, 1, 2, 3, 4]

I think the new syntax is actually quite handy for these kinds of circumstances. I've actually run into this a time or two in Python 2 where this new enhancement would have been quite useful.

Wrapping Up

There are lots of other examples in PEP 448 that are quite interesting to read about and try in Python's interpreter. I highly recommend checking it out and giving this feature a try. I am hoping to start using some of these features in my new code whenever we finally move to Python 3. [...]
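For comparison with the ChainMap approach mentioned above, a minimal sketch; note that ChainMap gives priority to the first mapping it receives, the mirror image of dict unpacking, where later values win:

>>> from collections import ChainMap
>>> dict(ChainMap({'x': 1}, {'x': 2}))
{'x': 1}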



DataCamp: Free DataCamp for your Classroom

Tue, 21 Feb 2017 14:22:02 +0000

Announcing: DataCamp for the Classroom, a new free plan for academics. We want to support every student who wants to learn data science. That is why, as of today, professors, teachers, and TAs can give their students 6 months of FREE access to the full DataCamp course curriculum when used in the classroom. Request your free classroom today.

1) Student Benefits

When you set up your DataCamp for the Classroom account, each student will automatically have full access to the entire course curriculum (>250 hours). This includes access to premium courses such as:

Introduction to R and Python for Data Science
Writing functions in R (Hadley Wickham, RStudio)
Introduction to Data Visualization with Python (Bryan Van de Ven, Bokeh)
pandas Foundation (Continuum Analytics)
and more. (See full course curriculum)

In addition, students can participate in leaderboards and private discussion forums with their fellow classmates.

2) Professor/Instructor/Teacher Benefits

Via DataCamp for the Classroom, the instructor has access to all our handy group features:

Progress Tracking: Track each of your students' progress in detail. You can see, for example, how many XP points they have gained, the number of courses completed, and total chapters finished.
Assignments & Deadlines: Save yourself time via automated grading. Automatically set homework using assignments and deadlines throughout the semester.
Export Data: Get all of your students' data via the export functionality. See in depth how active students are on DataCamp and when they started and completed chapters, and use this data to make your own analysis.
Learning Management System Integration: Seamlessly integrate DataCamp with your university's learning management system (Sakai, Blackboard, Canvas…). Automatically send student assignment completion data where it is expected to be found.

3) Get Started

Get started with DataCamp for the Classroom and join professors from Duke University, Notre Dame, Harvard, University College London, UC Berkeley, and many more in learning data science by doing with DataCamp. START TODAY [...]



DataCamp: Matplotlib Cheat Sheet: Plotting in Python

Tue, 21 Feb 2017 13:49:30 +0000

Data visualization and storytelling with your data are essential skills that every data scientist needs in order to communicate insights from their analyses effectively to any audience.

For most beginners, the first package they use to get in touch with data visualization and storytelling is, naturally, Matplotlib: a Python 2D plotting library that enables users to make publication-quality figures. What might be even more convincing is that other packages, such as pandas, intend to build more plotting integration with Matplotlib as time goes on.

However, what might slow down beginners is that this package is pretty extensive: there is so much you can do with it, and it can be hard to keep things structured while you're learning how to work with Matplotlib.

DataCamp has created a Matplotlib cheat sheet for those who might already know how to use the package to make beautiful plots in Python, but who still want to keep a one-page reference handy. Of course, for those who don't yet know how to work with Matplotlib, this might be the extra push they need to be convinced and to finally get started with data visualization in Python.

You'll see that this cheat sheet presents you with the six basic steps that you can go through to make beautiful plots. 

Check out the infographic by clicking on the button below:

(image)

With this handy reference, you'll familiarize yourself in no time with the basics of Matplotlib: you'll learn how you can prepare your data, create a new plot, use some basic plotting routines to your advantage, add customizations to your plots, and save, show and close the plots that you make.
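As a taste of those steps, here is a minimal sketch (not from the cheat sheet itself) that prepares data, creates a plot, customizes it, then saves and shows the result:

import matplotlib.pyplot as plt
import numpy as np

# Step 1: prepare the data.
x = np.linspace(0, 10, 100)
y = np.sin(x)

# Steps 2-3: create a figure and plot the data.
fig, ax = plt.subplots()
ax.plot(x, y, label='sin(x)')

# Step 4: customize the plot.
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')
ax.legend()

# Steps 5-6: save, show and close.
fig.savefig('sine.png')
plt.show()
plt.close(fig)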

What might have looked difficult before will definitely become clearer once you start using this cheat sheet!

Also, don't miss out on our other cheat sheets for data science, which cover SciPy, NumPy, Scikit-Learn, Bokeh, Pandas and the Python basics.




Chris Moffitt: Populating MS Word Templates with Python

Tue, 21 Feb 2017 13:25:00 +0000

Introduction

In a previous post, I covered one approach for generating documents using HTML templates to create a PDF. While PDF is great, the world still relies on Microsoft Word for document creation. In reality, it will be much simpler for a business user to create the desired template that supports all the custom formatting they need in Word versus trying to use HTML+CSS. Fortunately, there is a package that supports doing an MS Word mailmerge purely within Python. This approach has the advantage of running on any system - even if Word is not installed. The benefit of using Python for the merge (vs. an Excel sheet) is that you are not limited in how you retrieve or process the data. The full flexibility and power of the Python ecosystem is at your fingertips. This should be a useful tool to keep in mind any time you need to automate document creation.

Background

The package that makes all of this possible is fittingly called docx-mailmerge. It is a mature package that can parse the MS Word docx file, find the merge fields and populate them with whatever values you need. The package also supports some helper functions for populating tables and generating single files with multiple page breaks. The one comment I have about this package is that using the term "mailmerge" evokes a very simple use case - populating multiple documents with mailing addresses. I know that the standard Word approach is to call this process a mailmerge, but this "mailmerge" can be a useful templating system for much more sophisticated solutions than just populating names and addresses in a document.

Installation

The package requires lxml, which has platform-specific binary installs. I recommend using conda to install lxml and the dependencies, then using pip for the mailmerge package itself. I tested this on Linux and Windows and it seems to work fine on both platforms.

conda install lxml
pip install docx-mailmerge

That's it. Before we show how to populate the Word fields, let's walk through creating the Word document.

Word Merge Fields

In order for docx-mailmerge to work correctly, you need to create a standard Word document and define the appropriate merge fields. The examples below are for Word 2010. Other versions of Word should be similar. It actually took me a while to figure out this process but once you do it a couple of times, it is pretty simple.

Start Word and create the basic document structure. Then place the cursor in the location where the merged data should be inserted and choose Insert -> Quick Parts -> Field..: From the Field dialog box, select the "MergeField" option from the Field Names list. In the Field Name, enter the name you want for the field. In this case, we are using Business Name. Once you click OK, you should see something like this in the Word document: «Business Name». You can go ahead and create the document with all the needed fields.

Simple Merge

Once you have the Word document created, merging the values is a simple operation. The code below contains the standard imports and defines the name of the Word file.
In most cases, you will need to include the full path to the template, but for simplicity I am assuming it is in the same directory as your Python scripts:

from __future__ import print_function
from mailmerge import MailMerge
from datetime import date

template = "Practical-Business-Python.docx"

To create a mailmerge document and look at all of the fields:

document = MailMerge(template)
print(document.get_merge_fields())
{'purchases', 'Business', 'address', 'discount', 'recipient', 'date', 'zip', 'status', 'phone_number', 'city', 'shipping_limit', 'state'}

To merge in the values and save the results, use document.merge with all of the variables assigned a value and document.write to save the output:

document.merge(
    status='Gold',
    city='Springfield',
    phone_number='800-555-5555',
    Business='Cool Shoes',
    zip='55555',
    p[...]
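The excerpt above cuts off mid-call, but the shape of the final step is worth seeing end to end. A minimal sketch using the document.merge and document.write calls named above (the field values are placeholders and the output filename is an assumption):

from mailmerge import MailMerge

document = MailMerge("Practical-Business-Python.docx")
# Give every merge field in the template a value (placeholder values here).
document.merge(
    Business='Cool Shoes',
    status='Gold',
    city='Springfield',
)
# Write a new docx with the fields filled in; the template file is untouched.
document.write('merged-output.docx')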



Dataquest: Pandas Cheat Sheet - Python for Data Science

Tue, 21 Feb 2017 10:00:00 +0000

Pandas is arguably the most important Python package for data science. Not only does it give you lots of methods and functions that make working with data easier, but it has also been optimized for speed, which gives you a significant advantage over working with numeric data using Python's built-in functions.

(image)

The printable version of this cheat sheet

It’s common when first learning pandas to have trouble remembering all the functions and methods that you need, and while at Dataquest we advocate getting used to consulting the pandas documentation, sometimes it’s nice to have a handy reference, so we’ve put together this cheat sheet to help you out!

If you’re interested in learning pandas, you can consult our two-part pandas tutorial blog post, or you can signup for free and start learning pandas through our interactive pandas for data science course.

Key and Imports

In this cheat sheet, we use the following shorthand:

df Any pandas DataFrame object
s Any pandas Series...
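To make the shorthand concrete, a minimal sketch (not part of the cheat sheet) showing the conventional import together with a df and an s as defined above:

import pandas as pd

# df: a pandas DataFrame; s: a pandas Series.
df = pd.DataFrame({'name': ['Ada', 'Grace'], 'score': [95, 98]})
s = df['score']
print(df.head())
print(s.mean())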



S. Lott: Intro to Python CSV Processing for Actual Beginners

Tue, 21 Feb 2017 07:37:04 +0000

I've written a lot about CSV processing. Here are some examples: http://slott-softwarearchitect.blogspot.com/search/label/csv. It crops up in my books. A lot.

In all cases, though, I make the implicit assumption that my readers already know a lot of Python. This is a disservice to anyone who's getting started.

Getting Started

You'll need Python 3.6. Nothing else will do if you're starting out.

Go to https://www.continuum.io/downloads and get Python 3.6. You can get the small "miniconda" version to start with. It has some of what you'll need to hack around with CSV files. The full Anaconda version contains a mountain of cool stuff, but it's a big download.

Once you have Python installed, what next? To be sure things are running, do this:

1. Find a command line prompt (terminal window, cmd.exe, whatever it's called on your OS.)
2. Enter python3.6 (or just python in Windows.)

If Anaconda installed everything properly, you'll have an interaction that looks like this:

MacBookPro-SLott:Python2v3 slott$ python3.5
Python 3.5.1 (v3.5.1:37a07cee5969, Dec  5 2015, 21:12:44)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

More-or-less. (Yes, the example shows 3.5.1 even though I said you should get 3.6. As soon as the Lynda.com course drops, I'll upgrade. The differences between 3.5 and 3.6 are almost invisible.)

Here's your first interaction.

>>> 355/113
3.1415929203539825

Yep. Python did math. Stuff is happening.

Here's some more.

>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()

Okay. That was fun. But it's not data wrangling. When do we get to the good stuff?

To Script or Not To Script

We have two paths when it comes to scripting. You can write script files and run them. This is pretty normal application development stuff. It works well. Or. You can use a Jupyter Notebook. This isn't exactly a script. But. You can use it like a script. It's a good place to start building some code that's useful. You can rerun some (or all) of the notebook to make it script-like.

If you downloaded Anaconda, you have Jupyter. Done. Skip over the next part on installing Jupyter.

Installing Jupyter

If you did not download the full Anaconda -- perhaps because you used the miniconda -- you'll need to add Jupyter. You can use the command conda install jupyter for this. Another choice is to use the PIP program to install jupyter. The net effect is the same. It starts like this:

MacBookPro-SLott:Python2v3 slott$ pip3 install jupyter
Collecting jupyter
  Downloading jupyter-1.0.0-py2.py3-none-any.whl
Collecting ipykernel (from jupyter)
  Downloading ipykernel-4.5.2-py2.py3-none-any.whl (98kB)
    100% |████████████████████████████████| 102kB 1.3MB/s

It ends like this.

  Downloading pyparsing-2.1.10-py2.py3-none-any.whl (56kB)
    100% |████████████████████████████████| 61kB 2.1MB/s
Installing collected packages: ipython-genutils, decorator, traitlets, appnope, appdirs, pyparsing, packaging, setuptools, ptyprocess, pexpect, simplegeneric, wcwidth, prompt-toolkit, pickleshare, ipython, jupyter-core, pyzmq, jupyter-client, tornado, ipykernel, qtconsole, terminado, nbformat, entrypoints, mistune, pandocfilters, testpath, bleach, nbconvert, notebook, widgetsnbextension, ipywidgets, jupyter-console, jupyter
  Found existing installation: setuptools 18.2
    Uninstalling setuptools-18.2:
      Successfully uninstalled setuptools-18.2
  Running setup.py install for simplegeneric ... done
  Running setup.py install for tornado ... done
  Running setup.py install for terminado ... done
  Running setup.py install for pandocfilters ... done
Successfully installed appdirs-1.4.0 appnope-0.1.0 bleach-1.5.0 dec[...]
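The excerpt ends before the CSV examples themselves. As a taste of where this is headed, a minimal sketch (not from the original post) of reading a CSV with the standard library, assuming a file named data.csv with a header row:

import csv

# Each row comes back as a dict keyed by the header row.
with open('data.csv', newline='') as source:
    reader = csv.DictReader(source)
    for row in reader:
        print(row)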



Gocept Weblog: Zope at the turnpike of the Python 3 wonderland

Tue, 21 Feb 2017 06:46:28 +0000

A little tale

Once upon a time there was an earl named Zope II. He lived happily in a land called Python 2. For some years there had been rumours that a huge disaster would hit the country. The people ironically used to call it “sunset”. Prophets arose and said that 2020 would be the year when this disaster would finally happen.

Zope II got anxious about his future and the future of his descendants. But there were some brave people who liked Zope II and told him of the Python 3 wonderland. A land of eternal joy and happiness without “sunset” disasters and with no problems at all. It seemed like a dream to Zope II – too nice to be true?

After some research it became clear that the Python 3 wonderland was real – not completely as advertised by the people but nice to live in. So Zope II set the goal to settle down in the Python 3 wonderland before the “sunset” disaster would happen.

But this was not as easy as it seemed. The immigration authority told Zope II that he was “not compatible” with the Python 3 wonderland and that he needed “to be ported” before he could breathe the special Python 3 air.

As if this was not enough: as an earl, Zope II was not able to migrate without his staff, the many people he relied on for his daily procedures. Some of them had already been “ported” and were thus “compatible” with the new country. But there was one old but very important servant of Zope II named RestrictedPython. Zope II could not live without him – yet he was told that RestrictedPython would never be “compatible”. The authority even required a “complete rewrite”.

The Python 3 wonderland seemed so near, but it was so difficult to reach. But there were so many people who helped Zope II and encouraged him not to give up. Eventually it seemed possible for Zope II to get beyond the turnpike into the wonderful land called Python 3.

Back in reality

We are the people who like Zope II and are helping him get past the turnpike into Python 3. Since the Alpine City Sprint earlier this month, RestrictedPython is no longer a blocker: packages that depend on it can now be ported.

Come join us in porting the remaining dependencies of Zope, and Zope itself, to Python 3. There will be a Zope 2 Resurrection Sprint in May this year in Halle (Saale), Germany at gocept. Join us on site or remotely.





Daniel Bader: Sublime Text Settings for Writing Clean Python

Tue, 21 Feb 2017 00:00:00 +0000

Sublime Text Settings for Writing Clean Python

How to write beautiful and clean Python by tweaking your Sublime Text settings so that they make it easier to adhere to the PEP 8 style guide recommendations.

There are a few settings you can change to make it easier for you to write PEP 8 compliant Python with Sublime Text 3. PEP 8 is the most common Python style guide and is widely used in the Python community. The tweaks I describe in this article mainly deal with getting the placement of whitespace correct so that you don’t have to manage this (boring) aspect yourself. I’ll also show you how to get visual indicators for the maximum allowed line-lengths in your editor window so that your lines can be concise and beautifully PEP 8 compliant—just like Guido wants them to be 🙂

Optional: Opening Sublime’s Syntax-Specific Settings for Python

The settings we’re changing now are specific to Python. Feel free to place them in your User settings; that will work just fine. However, if you’d like to apply some or all of the settings in this chapter only to Python code, then here’s how you can do that: Open a Python file in Sublime Text (or create a new file, open the Command Palette and execute the “Set Syntax: Python” command). Click on Sublime Text → Preferences → Settings – More → Syntax Specific – User to open your Python-specific user settings. Make sure this opens a new editor tab called Python.sublime-settings. That’s the one you want! If you’d like to learn more about how Sublime Text’s preferences system works, then check out this tutorial I wrote.

Better Whitespace Handling

The following changes you can make to your (Syntax Specific) User Settings will help you keep the whitespace in your Python code clean and consistent:

"tab_size": 4,
"translate_tabs_to_spaces": true,
"trim_trailing_white_space_on_save": true,
"ensure_newline_at_eof_on_save": true

A tab_size of 4 is the general recommendation for writing Python. You’ll also want to enable translate_tabs_to_spaces to ensure that you don’t have a mixture of tabs and spaces in your Python files, which should be avoided. The trim_trailing_white_space_on_save option will remove superfluous whitespace at the end of lines or on empty lines. I highly recommend enabling this because it can save headaches and merge conflicts when working with Git and other forms of source control. PEP 8 recommends that Python files should end with a blank line to ensure that POSIX tools can process the file correctly. If you never want to worry about this again, turn on the ensure_newline_at_eof_on_save setting, which will make sure that your Python files end with a newline automatically.

Enable PEP 8 Line-Length Indicators

Another setting that’s really handy for writing PEP 8 compliant code is the “rulers” feature. It enables visual indicators in the editor area that show you the preferred maximum line length. You can enable several rulers with different line lengths at the same time. This helps you follow the PEP 8 recommendations of limiting your docstrings to 72 characters and limiting all other lines to 79 characters. Here’s how to set up the rulers feature for Python development. Open your (Syntax Specific) User Settings and add the following setting:

"rulers": [72, 79]

This will add two line-length indicators—one at 72 characters for docstrings, and one at 79 characters for regular lines. You can see them in the screenshot as vertical lines on the right-hand side of the editor area.
Turn On Word Wrapping

I like enabling Sublime’s word-wrapping feature when I’m writing Python. Most of my projects follow the PEP 8 style guide and therefore use a maximum line length of 79 characters. I don’t want to get in[...]
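Putting the pieces together, a Python.sublime-settings file combining the settings discussed above might look like this (a sketch containing only the keys covered in the article):

{
    "tab_size": 4,
    "translate_tabs_to_spaces": true,
    "trim_trailing_white_space_on_save": true,
    "ensure_newline_at_eof_on_save": true,
    "rulers": [72, 79]
}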



Django Weblog: Django 1.11 beta 1 released

Mon, 20 Feb 2017 23:27:02 +0000

Django 1.11 beta 1 is an opportunity for you to try out the medley of new features in Django 1.11.

Only bugs in new features and regressions from earlier versions of Django will be fixed between now and 1.11 final (also, translations will be updated following the "string freeze" when the release candidate is issued). The current release schedule calls for a release candidate about a month from now, with the final release to follow about two weeks after that, around April 1. We'll only be able to keep this schedule if we get early and frequent testing from the community. Updates on the release schedule are available on the django-developers mailing list.

As with all alpha and beta packages, this is not for production use. But if you'd like to take some of the new features for a spin, or to help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the beta package from our downloads page or on PyPI.

The PGP key ID used for this release is Tim Graham: 1E8ABDC773EDE252.




Coding Diet: Flask and Pytest coverage

Mon, 20 Feb 2017 16:27:27 +0000

I have written before about Flask and obtaining test coverage results here and with an update here. This is pretty trivial if you're writing unit tests that directly call the application, but if you actually want to write tests which animate a browser, for example with selenium, then it's a little more complicated, because the browser/test code has to run concurrently with the server code.

Previously I would have the Flask server run in a separate process and run 'coverage' over that process. This was slightly unsatisfying, partly because you sometimes want coverage analysis of your actual tests. Test suites, just like application code, can grow in size with many utility functions and imports etc. which may eventually end up not actually being used. So it is good to know that you're not needlessly maintaining some test code which is not actually invoked. We could probably get around this restriction by running coverage in both the server process and the test-runner's process and combining the results (or simply viewing them separately). However, this was unsatisfying simply because it felt like something that should not be necessary.

Today I spent a bit of time setting up a scheme to test a Flask application without the need for a separate process. I solved this by not using Flask's included Werkzeug server and instead using the WSGI server included in the standard-library wsgiref.simple_server module. Here is a minimal example:

import flask

class Configuration(object):
    TEST_SERVER_PORT = 5001

application = flask.Flask(__name__)
application.config.from_object(Configuration)

@application.route("/")
def frontpage():
    if False:
        pass  # Should not be covered
    else:
        return 'I am the lizard queen!'  # Should be in coverage.

# Now for some testing.
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import pytest
# Currently just used for the temporary hack to quit the phantomjs process
# see below in quit_driver.
import signal
import threading
import wsgiref.simple_server

class ServerThread(threading.Thread):
    def setup(self):
        application.config['TESTING'] = True
        self.port = application.config['TEST_SERVER_PORT']

    def run(self):
        self.httpd = wsgiref.simple_server.make_server('localhost', self.port, application)
        self.httpd.serve_forever()

    def stop(self):
        self.httpd.shutdown()

class BrowserClient(object):
    """Interacts with a running instance of the application via animating a browser."""
    def __init__(self, browser="phantom"):
        driver_class = {
            'phantom': webdriver.PhantomJS,
            'chrome': webdriver.Chrome,
            'firefox': webdriver.Firefox,
        }.get(browser)
        self.driver = driver_class()
        self.driver.set_window_size(1200, 760)

    def finalise(self):
        self.driver.close()
        # A bit of a hack this, but currently there is some bug, I believe in
        # the phantomjs code rather than selenium, but in any case it means that
        # the phantomjs process is not being killed, so we do so explicitly here
        # for the time being. Obviously we can remove this when that bug is
        # fixed. See: https://github.com/SeleniumHQ/selenium/issues/767
        self.driver.service.process.send_signal(signal.SIGTERM)
        self.driver.quit()

    def log_current_page(self, message=None, output_basename=None):
        content = self.driver.page_source
        # This is frequently what we really care about so I also output it
        # here as well to make it convenient to inspect (with highlighting).
        basename = output_basename or 'log-current-page'
        file_name = basename + '.html'
        with open(file_name, 'w') as outfile:
            if message:
                outfile[...]



GoDjango: Why You Should Pin Your Dependencies by My Mistakes

Mon, 20 Feb 2017 16:00:00 +0000

Have you ever been bitten by not pinning your dependencies in your Django project? If not, be glad, and come learn from my problems.

Pinning your dependencies is important for avoiding future unknown issues; better the devil you know, and all that.
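Concretely, pinning means recording exact versions in your requirements file so that a future install resolves to the same code. A minimal requirements.txt sketch (the version numbers here are illustrative, not recommendations):

Django==1.10.5
requests==2.13.0

An unpinned line like "Django", or a range like "Django>=1.10", leaves the installed version up to whatever pip finds at install time.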

In this week's video I talk about three times I had issues: not pinning my dependencies, hitting a weird edge case with pinning and Python, and not really understanding what I was doing with pinned dependencies.

Why You Should Pin Your Dependencies




Senthil Kumaran: CPython moved to Github

Mon, 20 Feb 2017 15:09:24 +0000

The CPython project moved its source code hosting from a self-hosted Mercurial repository at hg.python.org to the Git version control system hosted at GitHub. The new location of the Python project is http://www.github.com/python/cpython

This is the second big version control migration that has happened since I got involved. The first one was when we moved from svn to Mercurial. Branches were sub-optimal in svn and we used svn-merge.py to merge across branches. Mercurial helped there, and everyone got used to a distributed version control system written in Python: Mercurial. It was interesting for me personally to compare Mercurial with the other popular DVCS, Git.

Over the years, GitHub has become a popular place for developers to host their projects. They have constantly improved their service offering. Many Python developers got used to the Git version control system and found its utility value too.

Two years ago, it was decided that Python would move to Git and GitHub. The effort was led by Brett Cannon, assisted by a number of other developers, and the migration happened on Feb 10, 2017.

I helped with the migration too, working on tooling for converting the hg repository to git using the facilities available from the hg-git mercurial plugin.

We made use of hg-git and wrote some conversion scripts that could get us to the converted repo as we wanted:

  1. https://github.com/orsenthil/cpython-hg-to-git
  2. https://bitbucket.org/orsenthil/hg-git

Now that the migration is done, we are getting ourselves familiar to the new workflow.




Rene Dudfield: Is Type Tracing for Python useful? Some experiments.

Mon, 20 Feb 2017 15:01:48 +0000

Type Tracing - as a program runs, you trace it and record the types of variables coming in and out of functions, and being assigned to variables.

Is Type Tracing useful for providing quality benefits, documentation benefits, porting benefits, and also speed benefits to real Python programs?

Python is now a gradually typed language, meaning that you can gradually apply types and, along with type inference, statically check your code is correct. Once you have added types to everything, you can catch quite a lot of errors. For several years I've been using the new type checking tools that have been popping up in the Python ecosystem. I've given talks to user groups about them, and also trained people to use them. I think a lot of people are using these tools without even realizing it. They see warnings in their IDE about type issues, and methods are automatically completed for them. But I've always had some thoughts in the back of my head about recording types at runtime of a program in order to help the type inference out (and to avoid having to annotate them manually yourself).

Note that this technique is a different, but related, thing to what is done in a tracing JIT compiler.

Some days ago I decided to try Type Tracing out... and I was quite surprised by the results. I asked myself these questions.

- Can I store the types coming in and out of Python functions, and the types assigned to variables, in order to be useful for other things based on tracing the running of a program? (Yes)
- Can I "Type Trace" a complex program? (Yes, a flask+sqlalchemy app test suite runs)
- Is porting Python 2 code quicker with Type Tracing combined with static type checking, documentation generation, and test generation? (Yes, refactoring is safer with a type checker and no manually written tests)
- Can I generate better documentation automatically with Type Tracing? (Yes, return and parameter types and example values help understanding greatly)
- Can I use the types for automatic property testing? (Yes, hypothesis does useful testing just knowing some types and a few examples... which we recorded with the tracer)
- Can I use example capture for tests and docs, as well as the types? (Yes)
- Can I generate faster compiled code automatically just using the recorded types and Cython? (Yes)

Benefits from Type Tracing

Below I try to show that the following benefits can be obtained by combining Type Tracing with other existing Python tools:

- Automate documentation generation, by providing types to the documentation tool, and by collecting some example inputs and outputs.
- Automate some type annotation.
- Automatically find bugs static type checking can not. Without full type inference, existing Python static type checkers can not find many issues until the types are fully annotated. Type Tracing can provide those types.
- Speed up the Python 2 porting process, by finding issues other tools can't. It can also speed things up by showing people types and example inputs. This can greatly help people understand large programs when documentation is limited.
- Use for Ahead Of Time (AOT) compilation with Cython.
- Help property testing tools to find simple bugs without manually setting properties.

Tools used to hack something together:

- coverage (extended the coverage checker to record types as it goes)
- mypy (static type checker for Python)
- Hypothesis (property testing... automated test generator)
- Cython (a compiler for Python code, and code with type annotations)
- jedi (another Python static type checker)
- Sphinx (automatic documentation generator)
- CPython (the original C implementation of Python)

More details below on the experiments.

Type Tracing using 'coverage'

Originally I hacked up a set_trace script... and started going. But there really are so many corner cases. Also, I a[...]
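To make the core idea concrete: the post's tooling extends coverage, but the same effect can be sketched with the standard library's sys.settrace. A toy version (not the author's implementation) that records argument types on each call:

import sys
from collections import defaultdict

observed = defaultdict(set)  # function name -> set of argument type tuples

def trace_calls(frame, event, arg):
    # On each function call, record the runtime type of every argument.
    if event == 'call':
        code = frame.f_code
        names = code.co_varnames[:code.co_argcount]
        observed[code.co_name].add(
            tuple(type(frame.f_locals[name]).__name__ for name in names))
    return trace_calls

def add(a, b):
    return a + b

sys.settrace(trace_calls)
add(1, 2)
add('x', 'y')
sys.settrace(None)

print(dict(observed))  # e.g. {'add': {('int', 'int'), ('str', 'str')}} (order may vary)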



Weekly Python Chat: Django Forms

Mon, 20 Feb 2017 15:00:00 +0000

Special guest Kenneth Love is going to answer your questions about how to use Django's forms.




Doug Hellmann: uuid — Universally Unique Identifiers — PyMOTW 3

Mon, 20 Feb 2017 14:00:47 +0000

RFC 4122 defines a system for creating universally unique identifiers for resources in a way that does not require a central registrar. UUID values are 128 bits long and, as the reference guide says, “can guarantee uniqueness across space and time.” They are useful for generating identifiers for documents, hosts, application clients, and other situations … Continue reading uuid — Universally Unique Identifiers — PyMOTW 3
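As a quick taste of the module the post covers, a minimal sketch of the common constructors (standard-library calls only):

import uuid

# uuid1 mixes host identity and a timestamp; uuid4 is random.
print(uuid.uuid1())
print(uuid.uuid4())
# uuid5 is name-based: the same namespace and name always yield the same UUID.
print(uuid.uuid5(uuid.NAMESPACE_DNS, 'example.com'))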



Mike Driscoll: PyDev of the Week: Petr Viktorin

Mon, 20 Feb 2017 13:30:40 +0000

This week our PyDev of the Week is Petr Viktorin (@EnCuKou). Petr is the author of PEP 489 — Multi-phase extension module initialization — and teaches Python for the local PyLadies in the Czech Republic. You can see some of what he’s up to via his GitHub page or on his website. Let’s take some time to get to know Petr better!

Can you tell us a little about yourself (hobbies, education, etc)?

Sure! I’m a Python programmer from Brno, Czech Republic. I studied at the Brno University of Technology, and for my master’s I switched to the University of Eastern Finland. When I’m not programming, I enjoy playing board games with my friends, and sometimes go to an orienteering race (without much success).

Why did you start using Python?

At the university, I did coursework in languages like C, Java, and Lisp, but then I found Python and got hooked. It fit the way I think about programs, abstracted away most of the boring stuff, and makes it easy to keep the code understandable. After I returned home from the university, I found a community that was starting to form around the language, and that’s probably what keeps me around the language now.

What other programming languages do you know and which is your favorite?

Since I work with CPython a lot, I code in C – or at least I *read* C regularly. And I’d say C’s my favorite, after Python – they complement each other quite nicely. I can also throw something together in JavaScript. And C++, Java or PHP, though I don’t find much reason to code in those languages any more. Since I finished school, I sadly haven’t made much time to learn new languages. Someday, I’d like to explore Rust more seriously, but I haven’t found a good project for starting that yet.

What projects are you working on now?

I work at Red Hat, and the main job of our team is to package Python for Fedora and RHEL. The mission is to make sure everything works really great together, so when we succeed, the results of the work are somewhat invisible. My other project is teaching Python. A few years back, and without much teaching experience, I started a beginners’ Python course for the local PyLadies. I’ve spent a lot of time on making the content online and accessible to everyone, and over the years it got picked up in two more cities, and sometimes I find people going through the course from home. Now people are refining the course, and even building new workshops and other courses on top of it. Like any open-source project, it needs some maintenance, and I’m lucky to be able to spend some paid time both teaching and coordinating and improving Czech Python teaching materials. When I find some spare time, I hack on crazy side projects like a minimalistic 3D-printed MicroPython-powered game console.

Which Python libraries are your favorite (core or 3rd party)?

I’m sure Requests has appeared in these interviews before: it’s a great example of how a library should be designed. I also like the pyglet library. It’s an easy way to draw graphics on the screen, and I also use it to introduce people to event-driven programming.

Where do you see Python going as a programming language?

Strictly as a language, I don’t think Python will evolve too much. It’s already a good way to structure code and express algorithms. There will of course be improvements – especially the async parts are quite new and still have some rough corners – but I’m skeptical about any revolutionary additions. I think most improvements will come to the CPython implementation, not the language itself. I’m hopeful for p[...]



Full Stack Python: Creating SSH Keys on macOS Sierra

Mon, 20 Feb 2017 05:00:00 +0000

Deploying Python applications typically requires SSH keys. An SSH key has both a public and a private key file. You can use the private key to authenticate when syncing remote Git repositories, connect to remote servers and automate your application's deployments via configuration management tools like Ansible. Let's learn how to generate SSH key pairs on macOS Sierra.

Generating New Keys

Bring up a new terminal window on macOS by going into Applications/Utilities and opening "Terminal". The ssh-keygen command provides an interactive command line interface for generating both the public and private keys. Invoke ssh-keygen with the following -t and -b arguments to ensure we get a 4096 bit RSA key. Note that you must use a key with 2048 or more bits in macOS Sierra or the system will not allow you to connect to servers with it. Optionally, you can also specify your email address with -C (otherwise one will be generated off your current macOS account):

ssh-keygen -t rsa -b 4096 -C my.email.address@company.com

The first prompt you will see asks where to save the key. However, there are actually two files that will be generated: the public key and the private key.

Generating public/private rsa key pair.
Enter file in which to save the key (/Users/matt/.ssh/id_rsa):

This prompt refers to the private key, and whatever you enter will also generate a second file for the public key that has the same name with .pub appended. If you already have a key then specify a new filename. I use many SSH keys so I often name them "test-deploy", "prod-deploy", "ci-server" along with a unique project name. Naming is one of those hard computer science problems, so take some time to come up with a system that works for you!

Next you will see a prompt for an optional passphrase:

Enter passphrase (empty for no passphrase):

Whether or not you want a passphrase depends on how you will use the key. The system will ask you for the passphrase whenever you use the SSH key, although macOS can store the passphrase in your system Keychain after the first time you enter it. However, if you are automating deployments with a continuous integration server like Jenkins then you will not want a passphrase. Note that it is impossible to recover a passphrase if it is lost. Keep that passphrase safe and secure because otherwise a completely new key would have to be generated.

Enter the passphrase (or just press enter to not have a passphrase) twice. You'll see some output like the following:

Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/matt/.ssh/deploy_prod.
Your public key has been saved in /Users/matt/.ssh/deploy_prod.pub.
The key fingerprint is:
SHA256:UnRGH/nzYzxUFS9jjd0wOl1ScFGKgW3pU60sSxGnyHo matthew.makai@gmail.com
The key's randomart image is:
+---[RSA 4096]----+
| ..+o++**@|
| . +.o*O.@=|
| . oo*=B.*|
| . . =o=+ |
| . S E. +oo |
| . . . =.|
| . o|
| |
| |
+----[SHA256]-----+

Your SSH key is ready to use!

What now?

Now that you have your public and private keys, I recommend building and deploying some Python web apps such as:

- Building your first Slack bot
- Sending picture or video messages via a REST API
- Dialing outbound phone calls with the Bottle web framework

Additional ssh-keygen command resources:

- SSH keys on macOS Sierra
- Generating a new SSH key and adding it to the ssh-agent

Questions? Contact me via Twitter @fullstackpython or @mattmakai. I'm also on GitHub with the username mattmakai.

Something wrong with this post? Fork this pag[...]



Carl Trachte: Filling in Missing Grouping Columns of MSSQL SSRS Report Dumped to Excel

Mon, 20 Feb 2017 00:34:00 +0000

This is another simple but common problem in certain business environments:

1) Data are presented via a Microsoft SQL Server Reporting Services report, BUT
2) The user wants the data in Excel, and, further, wants to play with it (pivot, etc.) there.

The problem is that the grouping column labels are not in every record, only in the one row that begins the list of records for that group (sanitized screenshot below). But I don't WANT to copy and paste all those groupings for 30,000 records :*-(

I had this assignment recently from a remote request. It took about four rounds of an e-mail exchange to figure out that it really wasn't a data problem, but a formatting one that needed solving.

It is possible to do the whole thing in Python. I did the Excel part by hand in order to get a handle on the data:

1) In Excel, delete the extra rows on top of the report, leaving just the headers and the data.
2) In Excel, select everything on the data page and format the cells correctly by unselecting the Merge Cells and Wraparound options.
3) In Excel, at this point you should be able to see if there are extra empty columns as space fillers; delete them. Save the worksheet as a csv file.
4) In a text editor, open your csv file, identify any empty rows, and delete them. Change column header names as desired.

Now the Python part:

#!python36
"""Doctor csv dump from unmerged cell
dump of SSRS dump from MSSQL database.

Fill in cell gaps where merged
cells had only one grouping value
so that all rows are complete records.
"""

COMMA = ','
EMPTY = ''
INFILE = 'rawdata.csv'
OUTFILE = 'canneddumpfixed.csv'
ERRORFLAG = 'ERROR!'

f = open(INFILE, 'r')
headerline = next(f)
numbercolumns = len(headerline.split(COMMA))
f2 = open(OUTFILE, 'w')
# Carry the header row through to the output file.
f2.write(headerline)
# Assume at least one data column on far right.
missingvalues = (numbercolumns - 1) * [ERRORFLAG]
for linex in f:
    print('Processing line {:s} . . .'.format(linex))
    splitrecord = linex.split(COMMA)
    for slotx in range(0, numbercolumns - 1):
        if splitrecord[slotx] != EMPTY:
            missingvalues[slotx] = splitrecord[slotx]
        else:
            splitrecord[slotx] = missingvalues[slotx]
    f2.write(COMMA.join(splitrecord))
f2.close()
print('Finished')

At this point you've got your data in csv format - you can open it in Excel and go to work.

There may be a free or COTS (commercial off the shelf) utility that does all this somewhere in the Microsoft "ecosystem" (I think that's their fancy enviro-friendly word for vendor-user community) but I don't know of one.

Thanks for stopping by. [...]



Matthew Rocklin: Dask Development Log

Mon, 20 Feb 2017 00:00:00 +0000

This work is supported by Continuum Analytics, the XDATA Program, and the Data Driven Discovery Initiative from the Moore Foundation.

To increase transparency I'm blogging weekly(ish) about the work done on Dask and related projects during the previous week. This log covers work done between 2017-02-01 and 2017-02-20. Nothing here is ready for production. This blogpost is written in haste, so refined polish should not be expected.

Themes of the last couple of weeks:

- Profiling experiments with Dask-GLM
- Subsequent graph optimizations, both non-linear fusion and avoiding repeatedly creating new graphs
- Tensorflow and Keras experiments
- XGBoost experiments
- Dask tutorial refactor
- Google Cloud Storage support
- Cleanup of Dask + SKLearn project

Dask-GLM and iterative algorithms

Dask-GLM is currently just a bunch of solvers like Newton, Gradient Descent, BFGS, Proximal Gradient Descent, and ADMM. These are useful in solving problems like logistic regression, but also several others. The mathematical side of this work is mostly done by Chris White and Hussain Sultan at Capital One.

We've been using this project also to see how Dask can scale out machine learning algorithms. To this end we ran a few benchmarks here: https://github.com/dask/dask-glm/issues/26 . This just generates and solves some random problems, but at larger scales.

What we found is that some algorithms, like ADMM, perform beautifully, while for others, like gradient descent, scheduler overhead can become a substantial bottleneck at scale. This is mostly just because the actual in-memory NumPy operations are so fast; any sluggishness on Dask's part becomes very apparent. Here is a profile of gradient descent:

Notice all the white space. This is Dask figuring out what to do during different iterations. We're now working to bring this down to make all of the colored parts of this graph squeeze together better. This will result in general overhead improvements throughout the project.

Graph Optimizations - Aggressive Fusion

We're approaching this in two ways:

1. More aggressively fuse tasks together so that there are fewer blocks for the scheduler to think about
2. Avoid repeated work when generating very similar graphs

In the first case, Dask already does standard task fusion. For example, if you have the following tasks:

x = f(w)
y = g(x)
z = h(y)

Dask (along with every other compiler-like project since the 1980's) already turns this into the following:

z = h(g(f(w)))

What's tricky with a lot of these mathematical or optimization algorithms though is that they are mostly, but not entirely, linear. Consider the following example:

y = exp(x) - 1/x

Visualized as a node-link diagram, this graph looks like a diamond, like the following:

           o    exp(x) - 1/x
          / \
exp(x)   o   o   1/x
          \ /
           o    x

Graphs like this generally don't get fused together because we could compute both exp(x) and 1/x in parallel. However when we're bound by scheduling overhead and when we have plenty of parallel work to do, we'd prefer to fuse these into a single task, even though we lose some potential parallelism. There is a tradeoff here and we'd like to be able to exchange some parallelism (of which we have a lot) for less overhead. PR here dask/dask #1979 by Erik Welch (Erik has written and maintained most of Dask's graph optimizations).

Graph Optimizations - Structural Sharing

Additionally, we no longer make copies of graphs in dask.array. Every collection like a dask.array or dask.dataframe holds onto a[...]



Bhishan Bhandari: Raising and Handling Exceptions in Python – Python Programming Essentials

Sun, 19 Feb 2017 13:31:44 +0000

Brief Introduction

Any unexpected event that occurs during the execution of a program is known as an exception. Like everything else in Python, exceptions are objects: each is an instance of the Exception class or of a class derived from that base class. Exceptions may occur due to logical errors in […]
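As a preview of the raise-and-handle cycle the article walks through, a minimal sketch (the exception class and values here are made up for illustration):

class InsufficientFunds(Exception):
    """Raised when a withdrawal exceeds the available balance."""

def withdraw(balance, amount):
    if amount > balance:
        # Raising signals the unexpected event to the caller.
        raise InsufficientFunds('cannot withdraw {} from {}'.format(amount, balance))
    return balance - amount

try:
    withdraw(100, 150)
except InsufficientFunds as error:
    # Handling the exception keeps the program running and lets us react.
    print('Withdrawal failed:', error)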



Import Python: Import Python Weekly Issue 112 - Python Programming Videos By MIT, mypy static type checker and more

Sun, 19 Feb 2017 11:44:18 +0000

Worthy Read

Introduction to Computer Science and Programming in Python. Video Series from MIT
Introduction to Computer Science and Programming in Python is intended for students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems and to help students, regardless of their major, feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class uses the Python 3.5 programming language.
video

Whitepaper: 3 Ways Our Dev Teams Create Velocity with Multi-System Integrations
sponsor

Python repository moves to GitHub
Python core developer Brett talks about the history of the decision to move Python to GitHub.
core-python

memoryview
memoryview is a special type that can be used to work with data stored in other data-structures.
core python [...]



Jamal Moir: Become a Lord of the Cells and Speed up Your Jupyter Notebook Workflow

Sat, 18 Feb 2017 17:04:47 +0000

Everyone loves a good Jupyter Notebook. Jupyter Notebooks are an insanely convenient environment in which to rapidly prototype Python scripts and delve into data science. They speed up the time from writing code to actually executing it, and you can visually see the output for each section you write. I make heavy use of Jupyter Notebooks in my […]

The post Become a Lord of the Cells and Speed up Your Jupyter Notebook Workflow appeared first on Data Dependence.







Nicola Iarocci: Python Workload pulled off Visual Studio 2017 RC3

Sat, 18 Feb 2017 09:48:29 +0000

So how do you install the awesome Python Development Tools on the latest Visual Studio 2017 RC? That might seem a stupid question considering that the Data Science and Python Development workload has been available with every Release Candidate so far. You simply select the workload during the installation and you’re done, right? Not quite.

I found out the hard way this morning as I wanted to install VS 2017 RC3 on my development machine and, to my surprise, I could not find Python Development anywhere on the workloads window (which itself is a huge improvement over the VS 2015 install experience, by the way). Easy, I thought, they moved it to some secondary “optional workloads” tab, but a quick scan did not reveal any of that.

Concerned now, I turned to the Oracle of All Things only to find that the Python Workload had been pulled from Visual Studio 2017 RC3 (January 2017). It was actually reported in the release notes:

Removed the Data Science and Python Development workloads as some of the components weren’t meeting the release requirements, such as translation to non-English languages. They will be available soon as separate downloads.

When I glanced over them I (and probably you too) did not notice this little paragraph. But wait, it’s even worse than you would expect:

Upgrading to current version will remove any previously installed Python and Data Science workloads/components.

That’s right. If you upgrade to RC3 you win a wipe-out of your Python environment. Further research revealed an open ticket on GitHub. Apparently they are working on a way to install the Python and Data Science workloads on top of an existing VS 2017 install, but I would not hold my breath on it:

Thanks everyone for the support and understanding. It’s still not clear to us how we’re going to be releasing Python support, but the plan is definitely to have something when VS 2017 releases next month.

Since the official VS 2017 release is planned for early next month, it is very likely that we will just have to wait until then. In the meantime, you'd better have a VS 2015 sitting side by side with your brand new, mutilated Visual Studio 2017. Or you can switch to Visual Studio Code, which offers fantastic support for Python.

Or you fallback to good ole trusted Vim, like I did.

join the newsletter to get an email alert when a new post surfaces on this site. if you want to get in touch, i am @nicolaiarocci on twitter.




Full Stack Python: The Full Stack Python Blog

Sat, 18 Feb 2017 05:00:00 +0000

Full Stack Python began way back in December 2012 when I started writing the initial deployment, server, operating system, web server and WSGI server pages. Since then, the pages have expanded out into a boatload of other areas including subjects outside the deployment topics I originally started the site to explain.

Frequently, though, I wanted to write a Python walkthrough that was not a good fit for the page format I use for each topic. Many of those walkthroughs became Twilio blog posts, but not all of them were quite the right fit there. I'll still be writing plenty more Twilio tutorials, but this Full Stack Python blog is the spot for technical posts that fall outside the Twilio domain.

Let me know what you think and what tutorials you'd like to see in the future. Hit me up on Twitter @fullstackpython or @mattmakai.