Subscribe: Planet Python
http://www.planetpython.org/rss20.xml
Language: English
Tags:
chapter  coconut  code  data science  data  deep learning  features  https  learning  new  python  release  wire protocol 
Planet Python



Planet Python - http://planetpython.org/



 



Philip Semanchuk: Analyzing the Anglo-Saxonicity of the Baby BNC

Thu, 22 Jun 2017 18:46:13 +0000

Summary

This is a followup to an earlier post about using Python to measure the “Anglo-Saxonicity” of a text. I’ve used my code to analyze the Baby version of the British National Corpus, and I’ve found some interesting results.

How to Measure Anglo-Saxonicity – With a Ruler or Yardstick?

Introduction

Thanks to a suggestion from Ben Sizer, I decided to analyze the British National Corpus. I started with the ‘baby’ corpus which, as you might imagine, is smaller than the full corpus. It’s described as a “100 million word snapshot of British English at the end of the twentieth century“. It categorizes text samples into four groups: academic, conversations, fiction, and news.

Below are stack plots showing the percentage of Anglo-Saxon, non-Anglo-Saxon, and unknown words for each document in each of the four groups. The Y axis shows the percentage of words in each category. The numbers along the X axis identify individual documents within the group. I’ve deliberately given the charts non-specific names of Group A, B, C, and D so that we can play a game. :-)

Before we get to the game, here are the averages for each group in table form. (The numbers might not add exactly to 100% due to rounding.)

          Anglo-Saxon (%)   Non-Anglo-Saxon (%)   Unknown (%)
Group A   67.0              17.7                  15.3
Group B   56.1              25.8                  18.1
Group C   72.9              13.2                  13.9
Group D   58.6              22.0                  19.3

Keep in mind that “unknown” words represent shortcomings in my database more than anything else.

The Game

The Baby BNC is organized into groups of academic, conversations, fiction, and news. Groups A, B, C, and D each represent one of those groups. Which do you think is which? The answers to the game and a discussion of the results follow.
Answers

                    Anglo-Saxon (%)   Non-Anglo-Saxon (%)   Unknown (%)
A = Fiction         67.0              17.7                  15.3
B = Academic        56.1              25.8                  18.1
C = Conversations   72.9              13.2                  13.9
D = News            58.6              22.0                  19.3

Discussion

With the hubris that only 20/20 hindsight can provide, I’ll say that I don’t find these numbers terribly surprising.

Conversations have the highest proportion of Anglo-Saxon (72.9%) and the lowest of non-Anglo-Saxon (13.2%). Conversations are apt to use common words, and the 100 most common words in English are about 95% Anglo-Saxon. The relatively fast pace of conversation doesn’t encourage speakers to pause to search for those uncommon words lest they bore their listener or lose their chance to speak. I think the key here is not the fact that conversations are spoken, but that they’re impromptu. (Impromptu if you’re feeling French, off-the-cuff if you’re more Middle-English-y, or extemporaneous if you want to go full bore Latin.)

Academic writing is on the opposite end of the statistics, with the lowest portion of Anglo-Saxon words (56.1%) and the highest non-Anglo-Saxon (25.8%). Academic writing tends to be more ambitious and precise. Stylistically, it doesn’t shy away from more esoteric words because its audience is, by definition, well-educated. It doesn’t need to stick to the common core of English to get its point across. In addition, those who shaped academia were the educated members of society, and for many centuries education was tied to the church or limited to the gentry, and both spoke a lot of Latin and French. That has probably influenced even the modern day culture of academic writing.

Two of the academic samples managed to use fewer than half Anglo-Saxon words. They are a sample from Colliding Plane Waves in General Relativity (a subject Anglo-Saxons spent little time discussing, I’ll wager) and a sample from The Lancet, the British medical journal (49% and 47% Anglo-Saxon, respectively). It’s worth noting that these samples also displayed the highest and 5th-highest percentages of words of unknown etymology (26% and 21%, respectively) of the 30 samples in this category. A higher proportion of unknowns depresses the results in the other two categories.

Fiction rest[...]
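The per-document percentages discussed in this post can be sketched roughly as follows. This is only an illustration of the general idea, not the author's code: the tiny `ETYMOLOGY` lookup and the `anglo_saxonicity` function are hypothetical stand-ins for the real etymology database, which the post does not show.

```python
import re
from collections import Counter

CATEGORIES = ("anglo-saxon", "non-anglo-saxon", "unknown")

# Hypothetical toy lookup (an assumption for illustration only);
# a real analysis would need a full etymological database.
ETYMOLOGY = {
    "the": "anglo-saxon",
    "house": "anglo-saxon",
    "is": "anglo-saxon",
    "and": "anglo-saxon",
    "very": "non-anglo-saxon",
    "large": "non-anglo-saxon",
}


def anglo_saxonicity(text, lookup=ETYMOLOGY):
    """Return the percentage of words in each etymological category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {category: 0.0 for category in CATEGORIES}
    # Words missing from the lookup are counted as "unknown".
    counts = Counter(lookup.get(word, "unknown") for word in words)
    return {
        category: round(100.0 * counts[category] / len(words), 1)
        for category in CATEGORIES
    }


result = anglo_saxonicity("The house is very large and quixotic")
```

As in the tables above, a word the lookup does not know lowers the other two percentages, which is why a high "unknown" share makes a sample look less Anglo-Saxon than it may be.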



Django Weblog: DjangoCon US Schedule Is Live

Thu, 22 Jun 2017 17:46:57 +0000

We are less than two months away from DjangoCon US in Spokane, WA, and we are pleased to announce that our schedule is live! We received an amazing number of excellent proposals, and the reviewers and program team had a difficult job choosing the final talks. We think you will love them. Thank you to everyone who submitted a proposal or helped to review them.

Tickets for the conference are still on sale! Check out our website for more information on which ticket type to select. We have also announced our tutorials. They are $150 each, and may be purchased at the same place as the conference tickets.

DjangoCon US will be held August 13-18 at the gorgeous Hotel RL in downtown Spokane. Our hotel block rate expires July 11, so reserve your room today!




Mike Driscoll: Book Review: Software Architecture with Python

Thu, 22 Jun 2017 17:15:51 +0000

Packt Publishing approached me about being a technical reviewer for the book, Software Architecture with Python by Anand Balachandran Pillai. It sounded pretty interesting so I ended up doing the review for Packt. They ended up releasing the book in April 2017.

Quick Review

  • Why I picked it up: Packt Publishing asked me to do a technical review of the book
  • Why I finished it: Frankly because this was a well written book covering a broad range of topics
  • I’d give it to: Someone who is learning how to put together a large Python based project or application

Book Formats

You can get this as a physical soft cover, Kindle on Amazon or various other eBook formats via Packt Publishing’s website.

Book Contents

This book has 10 chapters and is 556 pages long.

Full Review

The focus of this book is to educate the reader on how they might design and architect a highly scalable, robust application in Python. The first chapter starts off by going over the author’s ideas on the “principles of software architecture” and what they are. This chapter has no code examples whatsoever. It is all theory and sets up what the rest of the book will be covering.

Chapter two is all about writing readable code that is easy to modify. It teaches some techniques for writing readable code and touches on recommendations regarding documentation, PEP8, refactoring, etc. It also teaches the fundamentals of writing modifiable code. Some of the techniques demonstrated in this chapter include abstracting common services, using inheritance, and late binding. It also discusses the topic of code smells.

Chapter three clocks in at almost 50 pages and is focused on making testable code. While you can’t really teach testing in just one chapter, it does talk about such things as unit testing, using nose2 and py.test, code coverage, mocking and doctests. There is also a section on test driven development.

In chapter four, we learn about getting good performance from our code. This chapter is about timing code, code profiling and high performance containers in Python. It covers quite a few modules / packages, such as cProfile, line profiler, memory profiler, objgraph and Pympler.

For chapter five, we dig into the topic of writing applications that can scale. This chapter has some good examples and talks about the differences between concurrency and parallelism, multithreading vs multiprocessing, and Python’s new asyncio module. It also discusses the Global Interpreter Lock (GIL) and how it affects Python’s performance in certain situations. Finally the reader will learn about scaling for the web and using queues, such as Celery.

If you happen to be interested in security in Python, then chapter 6 is for you. It covers various types of security vulnerabilities in software in general and then talks about what the author sees as security problems in Python itself. It also discusses various coding strategies that help the developer write secure code.

Chapter seven delves into the subject of design patterns and is over 70 pages long. You will learn about such things as the singleton, factory, prototype, adapter, facade, proxy, iterator, observer and state patterns. This chapter does a nice job of giving an overview of design patterns, but I think a book that devotes a chapter to each design pattern would be really interesting and would help drive the point home.

Moving on, we get to chapter 8 which talks about “architectural patterns”. Here we learn about Model View Controller (MVC), which is pretty popular in the web programming sphere. We also learn a bit about event driven programming using twisted, eventlet, greenlet and Gevent. I usually think of a user interface using something like PyQt or wxPython when I think of event driven programming, but either way the concepts are the same. There is also a section on microservices in this chapter.

Chapter nine’s focus is on deploying your Python applications. Here you will learn about usi[...]



EuroPython: EuroPython 2017: Call for on-site volunteers

Thu, 22 Jun 2017 15:29:28 +0000

Would you like to be more than a participant and help make this 2017 edition of EuroPython a smooth success? Help us!

We have a few tasks that are open for attendees who would like to volunteer: fancy helping at the registration desk? Willing to chair a session? Find out how you can contribute and which task you can commit to. 


What kind of qualifications do you need?

English is a requirement. More languages are an advantage. Check our webpage or write to us for any further information.

The conference ticket is a requirement. We cannot give you a free ticket, but we would like to thank you with one of our volunteer perks.

How do you sign up?

You can sign up for activities on our EuroPython 2017 Volunteer App.

We really appreciate your help!

Enjoy,

EuroPython 2017 Team
EuroPython Society
EuroPython 2017 Conference




PyCharm: PyCharm Edu 4 EAP: Integration with Stepik for Educators

Thu, 22 Jun 2017 14:52:36 +0000

PyCharm Educational Edition rolls out an Early Access Program update – download PyCharm Edu 4 EAP2 (build 172.3049).

Integration with Stepik for Educators

In 2016 we partnered with Stepik, a learning management and MOOC platform, to announce the Adaptive Python course. But if you want to create your own Python course with the help of PyCharm Edu, integration with Stepik may help you easily keep your learning materials up to date and share them with your students. Let’s take a simple example based on the Creating a Course with Subtasks tutorial and look at the integration features in more detail.

Uploading a New Course

Assume you’ve created a new course, added some lessons and checked the tasks. Now you want to test the new course and share it with your students. Using Stepik as a course platform is a great choice, thanks to integration with PyCharm Edu. First, you’ll need to create an account and log in. Going back to PyCharm Edu, you can now see a special Stepik icon in the Status Bar. Use the link Log in to Stepik to be redirected to Stepik.org and authorize PyCharm Edu. The Stepik Status Bar icon will be enabled after you authorize the course. Now you can upload the course to Stepik.

Updating a Course

Once a course is created and uploaded to Stepik, you can always add or change lessons or add subtasks to it, as we do in our example. The whole course, a lesson or just a single task can be updated any time you want to save your changes on Stepik.

Sharing a Course with Learners

Stepik allows educators to manage their courses: you can make your course visible to everyone, or invite your students privately (students need to have a Stepik account). Learners that have been invited to join the course can go to PyCharm Edu Welcome Screen | Browse Courses and log in to Stepik with a special link. The course is now available in the list.

There you go. Let us know how you like this workflow! Share your feedback here in the comments or report your findings on YouTrack, to help us improve PyCharm Edu.

To get all EAP builds as soon as we publish them, set your update channel to EAP (go to Help | Check for Updates, click the ‘Updates’ link, and then select ‘Early Access Program’ in the drop-down). To keep all your JetBrains tools updated, try JetBrains Toolbox App!

Your PyCharm Edu Team [...]



Python Anywhere: The PythonAnywhere API: beta now available for all users

Thu, 22 Jun 2017 12:12:36 +0000

We've been slowly developing an API for PythonAnywhere, and we've now enabled it so that all users can try it out if they wish. Head over to your accounts page and find the "API Token" tab to get started.

The API is still very much in beta, and it's by no means complete! We've started out with a few endpoints that we thought we ourselves would find useful, and some that we needed internally.

(image) Yes, I have revoked that token :)

Probably the most interesting functionality that it exposes at the moment is the ability to create, modify, and reload a webapp remotely, but there are a few other things in there as well, like file and console sharing. You can find a full list of endpoints here: http://help.pythonanywhere.com/pages/API
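As a rough illustration of reloading a webapp remotely, the sketch below only builds the HTTP request and does not send it. The base URL, the `/user/<username>/webapps/<domain>/reload/` path and the `Token` authorization header are assumptions here and should be checked against the endpoint list linked above; `build_reload_request` is a hypothetical helper name.

```python
from urllib.request import Request

# Assumed base URL for the beta API; verify against the endpoint list.
API_BASE = "https://www.pythonanywhere.com/api/v0"


def build_reload_request(username, domain_name, token):
    """Build (but do not send) a POST request that asks for a webapp reload."""
    url = f"{API_BASE}/user/{username}/webapps/{domain_name}/reload/"
    return Request(
        url,
        method="POST",
        # Assumed header format: the token from the "API Token" tab.
        headers={"Authorization": f"Token {token}"},
    )


request = build_reload_request("alice", "alice.pythonanywhere.com", "not-a-real-token")
```

Passing the result to `urllib.request.urlopen` would actually send it; keeping construction separate from sending makes it easy to inspect the URL and headers first.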

We're keen to hear your thoughts and suggestions!




A. Jesse Jiryu Davis: New Driver Features for MongoDB 3.6

Thu, 22 Jun 2017 08:12:59 +0000

At MongoDB World this week, we announced the big features in our upcoming 3.6 release. I’m a driver developer, so I’ll share the details about the driver improvements that are coming in this version. I’ll cover six new features. The first two are performance improvements with no driver API changes. The next three are related to a new idea, “MongoDB sessions”, and for dessert we’ll have the new Notification API.

  • Wire Protocol Compression
  • OP_MSG
  • Sessions
  • Retryable Writes
  • Causally Consistent Reads
  • Notification API

Wire Protocol Compression

Since 3.4, MongoDB has used wire protocol compression for traffic between servers. This is especially important for secondaries streaming the primary’s oplog: we found that oplog data can be compressed 20x, allowing secondaries to replicate four times faster in certain scenarios. The server uses the Snappy algorithm, which is a good tradeoff between speed and compression.

In 3.6 we want drivers to compress their conversations with the server, too. Some drivers can implement Snappy, but since zlib is more widely available we’ve added it as an alternative. When the driver and server connect they negotiate a shared compression format. (If you’re a security wonk and you know about CRIME, rest easy: we never compress messages that include the username or password hash.)

In the past, we’ve seen that the network is the bottleneck for some applications running on bandwidth-constrained machines, such as small EC2 instances or machines talking to very distant MongoDB servers. Compressing traffic between the client and server removes that bottleneck.

OP_MSG

This is feature #2. We’re introducing a new wire protocol message called OP_MSG. This will be a modern, high-performance replacement for all our messy old wire protocol. To explain our motive, let’s review the history.
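Before the history, a short aside on the compression negotiation described above. The sketch below is a simplified illustration and not MongoDB's actual handshake: `negotiate_compressor` and `maybe_compress` are hypothetical names, and the `sensitive` flag merely stands in for the rule that messages carrying credentials are never compressed.

```python
import zlib


def negotiate_compressor(client_preferences, server_supported):
    """Pick the first compressor the client prefers that the server also supports."""
    for name in client_preferences:
        if name in server_supported:
            return name
    return None  # no shared format: fall back to uncompressed traffic


def maybe_compress(payload, compressor, sensitive=False):
    """Compress a message body with zlib unless it is marked sensitive."""
    if compressor == "zlib" and not sensitive:
        return zlib.compress(payload)
    return payload


# A client that prefers Snappy talking to a server that only offers zlib.
chosen = negotiate_compressor(["snappy", "zlib"], {"zlib"})
message = b'{"insert": "events", "value": 1}' * 100
wire_bytes = maybe_compress(message, chosen)
```

Repetitive payloads such as the one above shrink considerably, which is the effect that relieves a bandwidth-constrained link.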
Ye Olde Wire Protocol

First, we had three kinds of write messages, all unacknowledged, and also a message for disposing a cursor. There were also two kinds of messages that expected a reply from the server: one to create a cursor with a query, and another to get more results from the cursor.

Soon we added another kind of message: commands. We reused OP_QUERY, defining a command as a query on the fake $cmd collection. We also realized that our users wanted acknowledged writes, which we implemented as a write message immediately followed by a getLastError command. That completes the picture of Ye Olde Wire Protocol.

This protocol served us remarkably well for years: it implemented the features we wanted and it was quite fast. But its messiness made our lives hard when we wanted to innovate. We couldn’t add all the features we wanted to the wire protocol, so long as we were stuck with these old message types.

Middle Wire Protocol

In MongoDB 2.6 through 3.2 we unified all the message types. Now we have the Middle Wire Protocol, in which everything is a command. This is the wire protocol we use now. It’s uniform and flexible and has allowed us to rapidly add features, but it has some disadvantages:

  • Still uses silly OP_QUERY on $cmd collection
  • No unacknowledged writes
  • Less efficient

Why is it less efficient? Let’s see how a bulk insert is formatted in the Middle Wire Protocol. The “insert” command has a standard message header, followed by the command body as a single BSON document. In order to include a batch of documents in the command [...]
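For readers unfamiliar with the “standard message header” mentioned above, here is a minimal sketch of packing and unpacking one in Python. It assumes the documented layout of four little-endian 32-bit integers (message length, request ID, response-to ID, opcode) and the opcode values 2004 for OP_QUERY and 2013 for OP_MSG; the helper names are hypothetical, and the body here is a placeholder rather than real BSON.

```python
import struct

# Opcode values as given in the MongoDB wire protocol documentation.
OP_QUERY = 2004
OP_MSG = 2013

# messageLength, requestID, responseTo, opCode: four little-endian int32s.
HEADER = struct.Struct("<iiii")


def pack_header(body_length, request_id, response_to, op_code):
    """Return the 16-byte header for a message whose body is body_length bytes."""
    total_length = HEADER.size + body_length  # the length field counts the header too
    return HEADER.pack(total_length, request_id, response_to, op_code)


def unpack_header(data):
    """Return (message_length, request_id, response_to, op_code)."""
    return HEADER.unpack_from(data)


body = b"\x00" * 10  # placeholder bytes standing in for a BSON command body
header = pack_header(len(body), request_id=1, response_to=0, op_code=OP_MSG)
fields = unpack_header(header + body)
```

Because every message type shares this header, a driver can read 16 bytes, learn the total length and opcode, and then decide how to parse the rest.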



eGenix.com: Python Meeting Düsseldorf - 2017-06-28

Thu, 22 Jun 2017 08:00:00 +0000

The original announcement is in German, since it concerns a regional user group meeting in Düsseldorf, Germany.

Announcement

The next Python Meeting Düsseldorf will take place on:

28.06.2017, 18:00
Room 1, 2nd floor, Bürgerhaus Stadtteilzentrum Bilk
Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf

News

Talks already registered

  • Matthias Endler: "Grumpy - Python to Go source code transcompiler and runtime"
  • Tom Engemann: "BeautifulSoup as a test framework for HTML"
  • Jochen Wersdörfer: "Machine Learning: Categorizing FAQs"
  • Linus Deike: "Introduction to Machine Learning: Building quality forecasts from sensor data"
  • Andreas Bresser: "Image recognition with OpenCV"
  • Philipp v.d. Bussche & Marc-Andre Lemburg: "Telegram bot as a Twitter interface: TwitterBot"

Further talks are welcome. If you are interested, please get in touch at info@pyddf.de.

Start time and location

We meet at 18:00 in the Bürgerhaus in the Düsseldorfer Arcaden. The Bürgerhaus shares its entrance with the swimming pool and is located on the side of the Düsseldorfer Arcaden where the underground car park entrance is. There is a large "Schwimm’ in Bilk" logo above the entrance. Behind the door, turn immediately left to the two elevators and go up to the 2nd floor. The entrance to Room 1 is directly on the left as you leave the elevator.

>>> Entrance in Google Street View

Introduction

The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region. Our PyDDF YouTube channel, where we publish videos of the talks after the meetings, gives a good overview of the talks. The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf.

Program

The Python Meeting Düsseldorf uses a mix of (lightning) talks and open discussion. Talks can be registered in advance or brought in spontaneously during the meeting. A projector with XGA resolution is available. To register a (lightning) talk, please send an informal email to info@pyddf.de.

Cost sharing

The Python Meeting Düsseldorf is organized by Python users for Python users. Since the meeting room, projector, internet and drinks incur costs, we ask participants for a contribution of EUR 10.00 incl. 19% VAT. School pupils and students pay EUR 5.00 incl. 19% VAT. We ask all participants to bring the amount in cash.

Registration

Since we only have seating for about 20 people, we ask you to register by email. Registering does not commit you to anything, but it makes planning easier for us. To register for the meeting, please send an informal email to info@pyddf.de.

Further information

Further information is available on the meeting’s website: http://pyddf.de/

Have fun!

Marc-Andre Lemburg, eGenix.com[...]



Continuum Analytics News: It’s Getting Hot, Hot, Hot: Four Industries Turning Up The Data Science Heat

Wed, 21 Jun 2017 17:56:34 +0000

Christine Doig, Sr. Data Scientist, Product Manager

Summer 2017 has officially begun. As temperatures continue to rise, so does the use of data science across dozens of industries. In fact, IBM predicts the demand for data scientists will increase by 28 percent in just three short years, and our own survey recently revealed that 96 percent of company executives conclude data science is critical to business success. While it’s clear that health care providers, financial institutions and retail organizations are harnessing the growing power of data science, it’s time for more industries to turn up the data science heat. We take a peek below at some of the up and comers.

Aviation and Aerospace

As data science continues to reach for the sky, it’s only fitting that the aviation industry is also on track to leverage this revolutionary technology. Airlines and passengers generate an abundance of data every day, but are not currently harnessing the full potential of this information. Through advanced analytics and artificial intelligence driven by data science, fuel consumption, flight routes and air congestion could be optimized to improve the overall flight experience. What’s more, technology fueled by data science could help aviation proactively avoid some of the delays and inefficiencies that burden both staff and passengers. Airlines just need to take a chance and fly with it!

Tweet from Oliver Wyman (@oliverwyman), June 17, 2017: The Data Science Revolution Transforming #Aviation via @forbes http://owy.mn/2rnTndz

Cybersecurity

In addition to aviation, cybersecurity has become an increasingly hot topic during the past few years. The global cost of handling cyberattacks is expected to rise from $400 billion in 2015 to $2.1 trillion by 2019, but implementing technology driven by data science can help secure business data and reduce these attacks. By focusing on the abnormalities, using all available data and automating whenever possible, companies will have a better chance at standing up to threatening attacks. Not to mention, artificial intelligence software is already being used to defend cyber infrastructure.

Construction

While improving data security is essential, the construction industry is another space that should take advantage of data science tools to improve business outcomes. As an industry that has long resisted change, some companies are now turning to data science technology to manage large teams, improve efficiency in the building process and reduce project delivery time, ultimately increasing profit margins. By embracing data analytics and these new technologies, the construction industry will also have more room to successfully innovate.

Ecology

From aviation to cybersecurity to construction, it’s clear that product-focused industries are on track to leverage data science. But what about the more natural side of things? One example suggests ecologists can learn more about ocean ecosystems through the use of technology driven by data science. Through coding and the use of other data science tools, these environmental scientists found they could conduct better, more effective oceanic research in significantly less time. Our hope is for other scientists to continue these methods and unearth more pivotal information about our planet.

So there you have it. Four industries that are beginning to harness the power of data science to help transform business processes, drive innovation and ultimately change the world. Who will the next four be? [...]



PyCharm: PyCharm 2017.2 EAP 4

Wed, 21 Jun 2017 17:00:20 +0000

The fourth early access program (EAP) version of PyCharm 2017.2 is now available! Go to our website to download it now.

New in this version:

  • Docker Compose support on Windows. If you’re using Docker on Windows, please try it out. To set up a Compose project, go to your project interpreter, Add Remote, choose the ‘Docker Compose’ interpreter type and get started. If you need some help getting started, read our blog post on Docker Compose for Linux and Mac and ignore the warnings about Windows.
  • Support for Azure databases, and Amazon Redshift. To try this out, open the database pane, use the green ‘+’ to add a data source, and choose the one you’re interested in.
  • We’ve also fixed a lot of bugs related to Django support, Jupyter notebooks, and code inspections.
  • And more! See the release notes for details

Please let us know how you like it! Users who actively report about their experiences with the EAP can win prizes in our EAP competition. To participate: just report your findings on YouTrack, and help us improve PyCharm.

To get all EAP builds as soon as we publish them, set your update channel to EAP (go to Help | Check for Updates, click the ‘Updates’ link, and then select ‘Early Access Program’ in the dropdown). If you’d like to keep all your JetBrains tools updated, try JetBrains Toolbox!

-PyCharm Team
The Drive to Develop




Enthought: Enthought Announces Canopy 2.1: A Major Milestone Release for the Python Analysis Environment and Package Distribution

Wed, 21 Jun 2017 16:30:55 +0000

Python 3 and multi-environment support, new state of the art package dependency solver, and over 450 packages now available free for all users Enthought is pleased to announce the release of Canopy 2.1, a significant feature release that includes Python 3 and multi-environment support, a new state of the art package dependency solver, and access to over 450 pre-built and tested scientific and analytic Python packages completely free for all users. We highly recommend that all current Canopy users upgrade to this new release. Ready to dive in? Download Canopy 2.1 here. For those currently familiar with Canopy, in this blog we’ll review the major new features in this exciting milestone release, and for those of you looking for a tool to improve your workflow with Python, or perhaps new to Python from a language like MATLAB or R, we’ll take you through the key reasons that scientists, engineers, data scientists, and analysts use Canopy to enable their work in Python. First, let’s talk about the latest and greatest in Canopy 2.1! Support for Python 3 user environments: Canopy can now be installed with a Python 3.5 user environment. Users can benefit from all the Canopy features already available for Python 2.7 (syntax checking, debugging, etc.) in the new Python 3 environments. Python 3.6 is also available (and will be the standard Python 3 in Canopy 2.2). All 450+ Python 2 and Python 3 packages are now completely free for all users: Technical support, full installers with all packages for offline or shared installation, and the premium analysis environment features (graphical debugger and variable browser and Data Import Tool) remain subscriber-exclusive benefits. See subscription options here to take advantage of those benefits. Built in, state of the art dependency solver (EDM or Enthought Deployment Manager): the new EDM back end (which replaces the previous enpkg) provides additional features for robust package compatibility. 
EDM integrates a specialized dependency solver which automatically ensures you have a consistent package set after installation, removal, or upgrade of any packages. Environment bundles, which allow users to easily share environments directly with co-workers, or across various deployment solutions (such as the Enthought Deployment Server, continuous integration processes like Travis-CI and Appveyor, cloud solutions like AWS or Google Compute Engine, or deployment tools like Ansible or Docker). EDM environment bundles not only allow the user to replicate the set of installed dependencies but also support persistence for constraint modifiers, the list of manually installed packages, and the runtime version and implementation. Multi-environment support: with the addition of Python 3 environments and the new EDM back end, Canopy now also supports managing multiple Python environments from the user interface. You can easily switch between Python 2.7 and 3.5, or between multiple 2.7 or 3.5 environments. This is ideal especially for those migrating legacy code to Python 3, as it allows you to test as you transfer and also provides access to historical snapshots or libraries that aren’t yet available in Python 3. Why Canopy is the Python platform of choice for scientists and engineers Since 2001, Enthought has focused on making the scientific Python stack accessible and easy to use for both enterprises and individuals. For example, Enthought released the first scientific Python distribution in 2004, added robust and corporate support for NumPy on 64-bit Windows in 2011, and released Canopy 1.0 in 2013. Since then, with its MATLAB-like experience, Canopy has enabled countless engineers, scientists and analysts to perform sophisticated analysis, build models, and create cutting-edge data science algorithms. Canopy’s a[...]



DataCamp: New Course: Deep Learning in Python (first Keras 2.0 online course!)

Wed, 21 Jun 2017 14:10:33 +0000

Hello there! We have a special course released today: Deep Learning in Python by Dan Becker. This happens to be one of the first online interactive courses providing instruction in Keras 2.0, which now supports TensorFlow integration; this new API will stay consistent in the coming years. So you've come to the right deep learning course.

About the course:

Artificial neural networks (ANN) are a biologically-inspired set of models that facilitate computers learning from observed data. Deep learning is a set of algorithms that use especially powerful neural networks. It is one of the hottest fields in data science, and most state-of-the-art results in robotics, image recognition and artificial intelligence (including the famous AlphaGo) use deep learning. In this course, you'll gain hands-on, practical knowledge of how to use neural networks and deep learning with Keras 2.0, the latest version of a cutting edge library for deep learning in Python.

Take me to Chapter 1!

Deep Learning in Python features interactive exercises that combine high-quality video, in-browser coding, and gamification for an engaging learning experience that will make you a master in deep learning in Python!


What you'll learn:

In the first chapter, you'll become familiar with the fundamental concepts and terminology used in deep learning, and understand why deep learning techniques are so powerful today. You'll build simple neural networks yourself and generate predictions with them. You can take this chapter here for free.

In chapter 2, you'll learn how to optimize the predictions generated by your neural networks. You'll do this using a method called backward propagation, which is one of the most important techniques in deep learning. Understanding how it works will give you a strong foundation to build from in the second half of the course.

In the third chapter, you'll use the Keras library to build deep learning models for both regression and classification! You'll learn about the Specify-Compile-Fit workflow that you can use to make predictions, and by the end of this chapter you'll have all the tools necessary to build deep neural networks!

Finally, you'll learn how to optimize your deep learning models in keras. You'll learn how to validate your models, understand the concept of model capacity, and experiment with wider and deeper networks. Enjoy!
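To give a flavor of the backward propagation idea mentioned for chapter 2, here is a minimal pure-Python sketch. It is not taken from the course: the tiny network (one input, one ReLU hidden unit, one output), the function names, and the learning rate are all assumptions chosen to keep the chain rule visible.

```python
def forward(x, w1, w2):
    """Forward pass: input -> ReLU hidden unit -> linear output."""
    hidden = max(0.0, w1 * x)
    return hidden, w2 * hidden


def backward(x, y, w1, w2):
    """Gradients of the squared error (pred - y)**2 with respect to w1 and w2."""
    hidden, pred = forward(x, w1, w2)
    d_pred = 2.0 * (pred - y)        # derivative of the loss w.r.t. the prediction
    d_w2 = d_pred * hidden           # chain rule through pred = w2 * hidden
    d_hidden = d_pred * w2           # error signal passed back to the hidden unit
    d_w1 = d_hidden * x if w1 * x > 0 else 0.0  # ReLU passes gradient only when active
    return d_w1, d_w2


def train(x, y, w1=0.5, w2=0.5, learning_rate=0.01, steps=200):
    """Repeatedly nudge both weights against their gradients."""
    for _ in range(steps):
        d_w1, d_w2 = backward(x, y, w1, w2)
        w1 -= learning_rate * d_w1
        w2 -= learning_rate * d_w2
    return w1, w2


w1, w2 = train(x=2.0, y=3.0)
_, prediction = forward(2.0, w1, w2)
```

Libraries such as Keras compute these gradients automatically for much larger networks, but the underlying bookkeeping is the same chain rule applied layer by layer.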





EuroPython: EuroPython 2017: Conference App available

Wed, 21 Jun 2017 11:56:34 +0000

We are pleased to announce our very own mobile app for the EuroPython 2017 conference:


EuroPython 2017 Conference App

Engage with the conference and its attendees

The mobile app gives you access to the conference schedule (even offline), helps you in planning your conference experience (create your personal schedule) and provides a rich social engagement platform for all attendees.

You can create a profile within the app (or link this to your existing social accounts), share messages and photos, and easily reach out to other fellow attendees - all from within the app.

Vital for all EuroPython attendees

We will again use the conference app to keep you updated, sending schedule updates and important announcements via push notifications, so please consider downloading it.

Many useful features

Please see our EuroPython 2017 Conference App page for more details on features and guides on how to use them.


Don’t forget to get your EuroPython ticket

If you want to join the EuroPython fun, be sure to get your tickets as soon as possible, since ticket sales have picked up quite a bit after we announced the schedule.

Enjoy,

EuroPython 2017 Team
EuroPython Society
EuroPython 2017 Conference




Kushal Das: Updates on my Python community work: 16-17

Wed, 21 Jun 2017 10:56:00 +0000

Thank you, everyone, for re-electing me to the Python Software Foundation board for 2017. The results of the vote came out on June 12th. This is my third term on the board; 2014 and 2016 were the previous two. In 2015 I was out, as the random module decided to choose someone else :)

Things I worked on last year

I was planning to write this in April, but somehow my flow of writing blog posts was broken, and I never managed to do so. But better late than never.

As I had written on the wiki page for candidates, one of my major goals last year was building communities outside the USA region. Given the warm welcome I have received in every upstream online community (and also at physical conferences), we should make sure that others are able to have the same experience.

As part of this work, I worked on three things:

  • Started PyCon Pune, the goal of the conference being upstream first
  • Led the Python track at FOSSASIA in Singapore
  • Helped the local PyLadies group (they are at an early stage)

You can read about our experience at PyCon Pune here. I think we were successful in spreading awareness of the bigger community which stands out there on the Internet throughout the world. All of the speakers pointed out how welcoming the community is, and how Python, the programming language, binds us all, be it scientific computing or small embedded devices. We also managed to have a proper dev sprint for all the attendees, where people made their first ever upstream contribution.

At FOSSASIA, we had many professionals attending the talks, and the kids were having their own workshops. There were various other Python talks in different tracks as well.

Our local PyLadies Pune group still has more beginner Python programmers than working members. We do have many members who work with Python in their jobs but have never worked with the community before. So my primary work there was not only about providing technical guidance but also about trying to make sure that the group itself gets better visibility in the local companies. Anwesha writes about the group in much more detail than I do, so you should go to her blog to learn about the group.

I am also the co-chair of the grants working group. As part of this group, we review the grant proposals the PSF receives. As the group members are distributed, we generally manage to get good input about these proposals. The number of grant proposals from every region has increased over the years, and I am sure we will see more events happening in the future.

Along with Lorena Mesa, I also helped as the communication officer for the board. She took charge of the board blog posts, and I was working on the emails. I was finding it difficult to calculate the amounts, so I wrote a small Python 3 script which helps me get the total numbers for every month's update.

This also reminds me that I managed to attend all the board meetings (they are generally between 9:30 PM and 6:30 AM for me in India) except the last one, just a week before PyCon. Even though I was in Portland during that time, I was confused about the actual time of the event, and jet lag did not help either.

I also helped our amazing GSoC org-admin team; Terri is putting in countless hours to make sure that the Python community gets a great experience in this program. I am hoping to find good candidates in Outreachy too. Last year, the PSF had funds for the same but did not manage to find a good candidate.

There were other conferences where I participated in different ways. Among them the Science Hack Day India was very special; working with so many kids, learning Python together in the MicroPython environment, was a special moment. Waiting eagerly for this year's event.

I will write about my goals in the [...]



Codementor: Building an Hello World Application with Python/Django

Wed, 21 Jun 2017 10:11:00 +0000

Read this beginner-friendly post to get started with using Django to build web projects!



Python Bytes: #31 You should have a change log

Wed, 21 Jun 2017 08:00:00 +0000

Brian #1: TinyMongo

  • Like MongoDB, but built on top of TinyDB.
  • Even runs on a Raspberry Pi, according to Stephen

Michael #2: A dead simple Python data validation library

  • validus.isemail('someone@example.com')
  • Validation functions include:
    • isrgbcolor()
    • isphone()
    • isisbn()
    • isipv4()
    • isint()
    • isfloat()
    • isslug()
    • isuuid()
  • Requires Python 3.3+
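For a feel of what boolean validators of this kind do, here is a minimal standard-library sketch. It is not validus's implementation, and the names `is_ipv4`, `is_uuid` and `is_int` are hypothetical stand-ins chosen for this example; they assume string input.

```python
import ipaddress
import uuid


def is_ipv4(value):
    """Return True if the string parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def is_uuid(value):
    """Return True if the string parses as a UUID."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def is_int(value):
    """Return True if the string parses as a base-10 integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


print(is_ipv4("192.168.0.1"), is_ipv4("256.1.1.1"))
print(is_uuid("12345678-1234-5678-1234-567812345678"), is_int("4.2"))
```

Each function returns a plain boolean instead of raising, which is the property that keeps calling code short.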

Brian #3: PuDB

  • In episode 29, https://pythonbytes.fm/29, I talked about launching pdb from pytest failures.
  • @kidpixo pointed out that PuDB was a better debugger and can also be launched from pytest failures.
  • Starting pudb from pytest failed tests (from docs): pytest --pdbcls pudb.debugger:Debugger --pdb --capture=no
  • Using pytest-pudb plugin to do the same: pytest --pudb

Michael #4: Analyzing Django requirement files on GitHub

  • From the pyup.io guys
  • Django is the most popular Python web framework.
  • It is now almost 12 years old and is used on all kinds of different projects.
  • Django developers pin their requirements (64%): Pinned or frozen requirements (Django==1.8.12) make builds predictable and deterministic.
  • Django 1.8 is the most popular major release (24%)
    • A bit worrisome are the 1.9 (14%), 1.7 (13%) and 1.6 (13%) releases in second, third and fourth place. None of them receives security updates any longer; 1.7 and 1.6 went EOL over 2 years ago.
  • Yikes: Only 2% of all Django projects are on a secure release
    • Among all projects, more than 60% use a Django release with one or more known security vulnerabilities. Only 2% are using a secure Django release.
    • For the remaining 30%-plus, it's unclear what exactly is going to be installed, because the Django release is either unpinned or specified as a range.
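The pinned / range / unpinned distinction above can be sketched in a few lines. This is an illustrative classifier, not the method pyup.io used; the function name `classify_django_requirement` and the sample requirement lines are assumptions made up for the example.

```python
import re
from collections import Counter

# Matches a requirement line for the "django" package itself, capturing the
# version specifier (if any). Other packages whose names merely start with
# "django" do not match.
DJANGO_LINE = re.compile(r"^django\s*((?:[=<>!~].*)?)$", re.IGNORECASE)


def classify_django_requirement(line):
    """Return 'pinned', 'range', 'unpinned', or None for non-Django lines."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    match = DJANGO_LINE.match(line)
    if match is None:
        return None
    spec = match.group(1).strip()
    if not spec:
        return "unpinned"
    if spec.startswith("==") and "," not in spec and "*" not in spec:
        return "pinned"
    return "range"


# Hypothetical requirement lines, as they might appear across several files.
lines = [
    "Django==1.8.12",
    "django>=1.8,<1.9",
    "Django",
    "djangorestframework==3.6.3",
    "requests==2.18.1  # not Django",
    "Django==1.11.*",
]

counts = Counter(kind for kind in map(classify_django_requirement, lines) if kind)
print(counts)
```

Only the exact `==` form says precisely which release gets installed; everything else depends on what is newest at install time.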

Brian #5: Changelogs




Talk Python to Me: #117 Functional Python with Coconut

Wed, 21 Jun 2017 08:00:00 +0000

One of the nice things about the Python language is it's at least 3 programming paradigms in one: There's the procedural style, object-oriented style, and functional style.
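As a small illustration of those three styles, here is the same computation (the sum of the squares of the even numbers) written three ways in plain Python. The function and class names are made up for this sketch and are unrelated to Coconut's own syntax.

```python
from functools import reduce


def procedural(numbers):
    """Procedural style: an explicit loop mutating a local variable."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


class EvenSquareSummer:
    """Object-oriented style: state lives on an object, behavior in methods."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        if n % 2 == 0:
            self.total += n * n
        return self


def functional(numbers):
    """Functional style: compose filter and reduce, no mutation."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    return reduce(lambda acc, n: acc + n * n, evens, 0)


numbers = range(1, 7)
summer = EvenSquareSummer()
for n in numbers:
    summer.add(n)

print(procedural(numbers), summer.total, functional(numbers))
```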

This week you'll meet Evan Hubinger, who is taking Python's functional programming style and turning it up to 11. We're talking about Coconut, a full functional programming language that is a proper superset of Python itself.

Show note: Sorry for the lower audio quality in my voice on this one. Looks like my primary mic had trouble and the fallback wasn't as good as it should be. Plus, I had mostly lost my voice from PyCon (PyCon!!! And other loud speaking).





Nikola: Nikola v7.8.9 is out! (maintenance release)

Tue, 20 Jun 2017 19:00:15 +0000

On behalf of the Nikola team, I am pleased to announce the immediate availability of Nikola v7.8.9. This is a maintenance release for the v7 series.

Future releases in the v7 series are going to be small maintenance releases that include bugfixes only, as work on v8.0.0 is underway.

What is Nikola?

Nikola is a static site and blog generator, written in Python. It can use Mako and Jinja2 templates, and input in many popular markup formats, such as reStructuredText and Markdown, and can even turn Jupyter Notebooks into blog posts! It also supports image galleries, and is multilingual. Nikola is flexible, and page builds are extremely fast, courtesy of doit (which rebuilds only what has changed).

Find out more at the website: https://getnikola.com/

Downloads

Install using pip install Nikola or download tarballs on GitHub and PyPI.

Changes

  • Restore missing unminified assets
  • Make failures to get source commit hash non-fatal in github_deploy (Issue #2847)
  • Fix a bug in HTML meta parsing that crashed on tags without name (Issue #2835)
  • Fix math not showing up in some circumstances (Issue #2841)



Ian Ozsvald: Kaggle’s Quora Question Pairs Competition

Tue, 20 Jun 2017 14:14:20 +0000

Kaggle’s Quora Question Pairs competition has just closed. I’m pleased to say that with 10 days’ effort I ranked in the top 39th percentile (rank 1346 of 3396 in the private leaderboard). Having just run and spoken at PyDataLondon 2017, taught ML in Romania and worked on several client projects, I only freed up time right at the end of this competition. Despite joining at the end I had immense fun; this was my first ‘proper’ Kaggle competition.

I figured a short retrospective here might be a useful reminder to myself in the future.

Things that worked well:

  • Use of github, Jupyter Notebooks, my research module template
  • Python 3.6, scikit-learn, pandas
  • RandomForests (some XGBoost but ultimately just RFs)
  • Dask (great for using all cores when feature engineering with Pandas apply)
  • Lots of text similarity measures, word2vec, some Part of Speech tagging
  • Some light text clean-up (punctuation, whitespace, some mixed case normalisation)
  • Spacy for PoS noun extraction, some NLTK
  • Splitting feature generation and ML exploitation into different Notebooks
  • Lots of visualisation of each distance measure by class (mainly matplotlib histograms on single features)
  • Fully reproducible Notebooks with fixed seeds
  • Debugging code to diagnose the most-wrong guesses from the model (pulling out features and the raw questions was often enough to get a feel for “what it missed”, which led to thoughts on new features that might help)

Things that I didn’t get around to trying due to lack of time:

  • PoS named entities in Spacy, my own entity recogniser
  • GloVe, wordrank, fasttext
  • Clustering around topics
  • Text clean-up (synonyms, weights & measures normalisation)
  • Use of external corpus (e.g. Stackoverflow) for TF-IDF counts
  • Dask on EC2

Things that didn’t work so well:

  • Fully reproducible Notebooks (great!) to generate features with no caching of no-need-to-rebuild-yet-again features, so I did a lot of recalculating features (which really hurt in the last 2 days); possible solution below with named columns
  • Notebooks are still a PITA for debugging; attaching a console with --existing works ok until things start to crash and then it gets sticky
  • Running out of 32GB of RAM several times on my laptop and having a semi-broken system whilst trying to persist partial models to disk; I should have started with an AWS deployment earlier so I could easily turn on more cores+RAM as needed
  • I barely checked the Kaggle forums (only reading the Notebooks concerning the negative resampling requirement) so I missed a whole pile of tricks shared by others; some I folded in on the last day but there’s a huge pile that I missed. I think I might have snuck into the top 20% of rankings if I’d used this public information
  • Calibrating RandomForests (I’m pretty convinced I did this correctly but it didn’t improve things, I’m not sure why)

Dask definitely made parallelisation easier, with only a few lines of overhead in a function beyond a normal call to apply. The caching, if using something like luigi, would add a lot of extra engineered overhead, which is not so useful in a rapidly iterating 10 day competition.

I think next time I’ll try using version-named columns in my DataFrames. Rather than having e.g. “unigram_distance_raw_sentences” I might add “_v0”; if that calculation process is never updated then I can just use a pre-built version of the column. This is a poor man’s caching strategy. If any dependencies existed then I guess luigi/airflow would be[...]
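The version-named column idea could look roughly like the sketch below. To stay dependency-free it uses a plain dict of lists as a stand-in for a DataFrame; the helper name `versioned_feature`, the toy feature and the sample rows are all hypothetical, not the author's code.

```python
def versioned_feature(columns, name, version, compute, rows):
    """Build the column '<name>_v<version>' once and reuse it afterwards.

    `columns` is a dict of column name -> list of values, standing in for a
    DataFrame. Bumping `version` forces a rebuild under a new column name.
    """
    key = "{}_v{}".format(name, version)
    if key not in columns:
        columns[key] = [compute(row) for row in rows]
    return columns[key]


calls = []  # records every row the feature function actually processes


def word_count(sentence):
    calls.append(sentence)
    return len(sentence.split())


rows = ["how are you", "who are you today"]
columns = {}

first = versioned_feature(columns, "word_count", 0, word_count, rows)
second = versioned_feature(columns, "word_count", 0, word_count, rows)  # cached

print(first, sorted(columns), len(calls))
```

The second call finds `word_count_v0` already present and skips the computation. Persisting `columns` to disk between sessions would extend the same trick across notebook restarts.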



Frank Wierzbicki: Jython 2.7.1 release candidate 3 released!

Tue, 20 Jun 2017 11:11:36 +0000

On behalf of the Jython development team, I'm pleased to announce that the third release candidate of Jython 2.7.1 is available! This is a bugfix release. Bug fixes include improvements in ssl and pip support.

Please see the NEWS file for detailed release notes. This release of Jython requires JDK 7 or above.

This release is being hosted at maven central. There are three main distributions. In order of popularity:
To see all of the files available including checksums, go to the maven query for org.python+Jython and navigate to the appropriate distribution and version.



PyCharm: Upgrade Your Testing with Behavior Driven Development

Tue, 20 Jun 2017 10:10:48 +0000

BDD? Why should I care?

Back in the day, I used to write terrible code. I’m probably not the only one who started out writing terrible PHP scripts in high school that just got the job done. For me, the moment that I started to write better code was the moment that I discovered unit testing. Testing forced me to properly organize my code, and keep classes simple enough that testing them in isolation would be possible. Behavior Driven Development (BDD) testing has the potential to do the same at a program level rather than for individual classes.

A common problem when making software is that different people have different opinions on what the software should do, and these differences only become apparent when someone is disappointed in the end. This is why in large software projects it’s commonplace to spend quite a bit of effort on getting the requirements right.

If you’re making a small personal project, BDD can help you by forcing you to write down what you want your program to do before you start programming. I speak from experience when I say this helps you to finish your projects. Furthermore, in contrast to regular unit testing you get separation of concerns between your test scenario and your test code. Which programmer doesn’t become excited about separation of concerns? This one sure does.

The real value of BDD arises for those of you who do contract work for small businesses, and even those with small to medium-sized open source projects. Wouldn’t it be useful to have a simple way to communicate exactly what the software should do to everyone involved?

Okay, so how do I get better software?

Behavior driven development is just that: development which is driven by the behavior you want from your code. Those of you who do agile probably know the “As a <role>, I want <feature>, so that <benefit>” template; in BDD a similar template is proposed: “In order to <benefit>, as a <role>, I want <feature>”. In this template the goal of your feature is emphasized, so let’s do some truth in advertising here:

    In order to show off PyCharm’s cool support for BDD
    As a Product Marketing Manager at JetBrains
    I want to make a reasonably complex example project that shows how BDD works

This is still rather vague, so let’s come up with an actual example project:

    Feature: Simulate a basic car
      To show off how BDD works, let’s create a sample project which takes the
      classic OO car example, and supercharges it.

      The car should take into account: engine power, basic aerodynamics, rolling
      resistance, grip, and brake force.

      To keep things somewhat simple, the engine will supply constant power (this
      is not realistic, as it results in infinite Torque at zero RPM)

A key element of BDD is using examples to illustrate the features, so let’s write an example:

      Scenario: The car should be able to brake
        The UK highway code says that worst case scenario we need to stop from
        60mph (27 m/s) in 73m
        Given that the car is moving at 27 m/s
        When I brake at 100% force
        And 10 seconds pass
        Then I should have traveled less than 73 meters

By writing this example, it becomes clear that our code will need to be aware of time passing, and keep track of the car’s speed, the distance traveled, and the amount of braking applied at a given point in time. If you have complex examples, or want to check a couple of similar examples, you can use ASCII tables in your feature file to do this. To keep this blog post to a reasonable length I won’t di[...]
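To see what the braking scenario demands of the code, here is a minimal sketch of a car model that tracks time, speed, distance and brake setting. It is not the article's implementation: the `Car` class, its method names and the maximum deceleration of 7 m/s² are assumptions chosen only for illustration, and the scenario steps are played out as plain Python instead of through a BDD framework.

```python
class Car:
    MAX_BRAKE_DECELERATION = 7.0  # m/s^2 at 100% brake (hypothetical value)

    def __init__(self, speed=0.0):
        self.speed = speed      # m/s
        self.distance = 0.0     # m traveled so far
        self.brake = 0.0        # fraction of full brake force, 0.0 to 1.0
        self.time = 0.0         # s elapsed

    def set_brake(self, fraction):
        self.brake = min(max(fraction, 0.0), 1.0)

    def step(self, dt):
        """Advance the simulation by dt seconds at constant deceleration."""
        decel = self.brake * self.MAX_BRAKE_DECELERATION
        # The car only moves until it has come to a stop within this step.
        moving = dt if decel == 0 else min(dt, self.speed / decel)
        self.distance += self.speed * moving - 0.5 * decel * moving ** 2
        self.speed = max(self.speed - decel * moving, 0.0)
        self.time += dt


# Given that the car is moving at 27 m/s
car = Car(speed=27.0)
# When I brake at 100% force
car.set_brake(1.0)
# And 10 seconds pass
for _ in range(100):
    car.step(0.1)
# Then I should have traveled less than 73 meters
print(car.distance < 73, round(car.distance, 2))
```

Under this assumed deceleration the stopping distance is v²/(2a) = 27²/14, roughly 52 m, so the scenario passes. A model with aerodynamics and grip would need its own numbers.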



S. Lott: NMEA Data Acquisition -- An IoT Exercise with Python

Tue, 20 Jun 2017 08:00:04 +0000

Here's the code: https://github.com/slott56/NMEA-Tools. This is Python code to do some Internet of Things (IoT) stuff. Oddly, even when things are connected by a point-to-point serial interface, it's still often called IoT, even though there's no "Internetworking."

Some IoT projects have a common arc: exploration, modeling, filtering, and persistence. This is followed by the rework to revise the data models and expand the user stories. And then there's the rework conundrum. Stick with me to see just how hard rework can be.

What's this about? First some background. Then I'll show some code.

Part of the back story is here: http://www.itmaybeahack.com/TeamRedCruising/travel-2017-2018/that-leaky-hatch--chartplot.html

In the Internet of Things Boaty (IoT-B) there are devices called chart-plotters. They include GPS receivers, displays, and controls. And algorithms. Most important is the merging of GPS coordinates and an active display. You see where your boat is.

Folks with GPS units in cars and on their phones have an idea of the core feature set of a chart plotter. But the value of a chart plotter on a boat is orders of magnitude above the value in a car.

At sea, the hugeness and importance of the chartplotter is magnified. The surface of a large body of water is (almost) trackless. Unless you're really familiar with it, it's just water, generally opaque. The depths can vary dramatically. A shoal too shallow for your boat can be more-or-less invisible and just ahead. Bang. You're aground (or worse, holed.)

A chart -- and knowledge of your position on that chart -- is a very big deal. Once you sail out of sight of land, the chart plotter becomes a life-or-death necessity. While I can find the North American continent using only a compass, I'm not sure I could find the entrance to Chesapeake Bay without knowing my latitude. (Yes, I have a sextant. Would I trust my life to my sextant skills?)

Modern equipment uses modern hardware and protocols. N2K (NMEA 2000), for example, is powered Ethernet connectivity that uses a simplified backbone with drops for the various devices. Because it's Ethernet, they're peers, and interconnection is simplified. See http://www.digitalboater.com for some background.

The Interface Issue

The particularly gnarly problem with chart plotters is the lack of an easy-to-live-with interface.

They're designed to be really super robust, turn-it-on-and-it-works products. Similar to a toaster, in many respects. Plug and play. No configuration required.

This is a two-edged sword. No configuration required bleeds into no configuration possible.

The Standard Horizon CP300i uses NT cards. Here's a reader device. Note the "No Longer Available" admonition. All of my important data is saved to the NT card. But. The card is useless except for removable media backup in case the unit dies.

What's left? The NMEA-0183 interface wiring.

NMEA Serial EIA-422

The good news is that the NMEA wiring is carefully documented in the CP300i owner's manual. There are products like this NMEA-USB Adaptor. A few wire interconnections and we can -- at least in principle -- listen to this device.

The NMEA standard was defined to allow numerous kinds of devices to work together. When it was adopted (in 1983), the idea was that a device would be a "talker" and other devices would be "listeners." The intent was to have a lot of point-to-point conversations: one talker, many listeners.

A digital depth meter or wind meter, for example, could talk all day, pushing out message traffic with depth o[...]
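A listener on an NMEA-0183 line typically starts by checking each sentence's checksum: the two hex digits after the `*` are the XOR of every character between the `$` and the `*`. The sketch below shows only that framing check. It is not code from the NMEA-Tools repository; the function names and the sample sentence body are made up for illustration.

```python
from functools import reduce


def nmea_checksum(body):
    """XOR of the character codes between '$' and '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def is_valid_sentence(sentence):
    """Check the '$<body>*<hex checksum>' framing of one sentence."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].rpartition("*")
    try:
        return nmea_checksum(body) == int(given, 16)
    except ValueError:
        return False


# Hypothetical sentence body; the checksum is computed here, not copied
# from a real device.
body = "GPGLL,3751.65,N,07626.12,W,225444,A"
sentence = "${}*{:02X}".format(body, nmea_checksum(body))

print(sentence, is_valid_sentence(sentence))
```

A talker that pushes traffic all day will occasionally produce a garbled line, so a check like this would normally sit in front of any field parsing.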



Curtis Miller: Walk-Forward Analysis Demonstration with backtrader

Tue, 20 Jun 2017 00:00:24 +0000

Here I demonstrate walk-forward analysis with backtrader and investigate overfitting with backtesting and optimizing strategies for stock data.



Daniel Bader: Enriching Your Python Classes With Dunder (Magic, Special) Methods

Tue, 20 Jun 2017 00:00:00 +0000

What Python’s “magic methods” are and how you would use them to make a simple account class more Pythonic.

What Are Dunder Methods?

In Python, special methods are a set of predefined methods you can use to enrich your classes. They are easy to recognize because they start and end with double underscores, for example __init__ or __str__.

As it quickly became tiresome to say under-under-method-under-under, Pythonistas adopted the term “dunder methods”, a short form of “double under.”

These “dunders” or “special methods” in Python are also sometimes called “magic methods.” But using this terminology can make them seem more complicated than they really are; at the end of the day there’s nothing “magical” about them. You should treat these methods like a normal language feature.

Dunder methods let you emulate the behavior of built-in types. For example, to get the length of a string you can call len('string'). But an empty class definition doesn’t support this behavior out of the box:

    class NoLenSupport:
        pass

    >>> obj = NoLenSupport()
    >>> len(obj)
    TypeError: "object of type 'NoLenSupport' has no len()"

To fix this, you can add a __len__ dunder method to your class:

    class LenSupport:
        def __len__(self):
            return 42

    >>> obj = LenSupport()
    >>> len(obj)
    42

Another example is slicing. You can implement a __getitem__ method which allows you to use Python’s list slicing syntax: obj[start:stop].

Special Methods and the Python Data Model

This elegant design is known as the Python data model and lets developers tap into rich language features like sequences, iteration, operator overloading, attribute access, etc.

You can see Python’s data model as a powerful API you can interface with by implementing one or more dunder methods. If you want to write more Pythonic code, knowing how and when to use dunder methods is an important step.

For a beginner this might be slightly overwhelming at first, though. No worries, in this article I will guide you through the use of dunder methods using a simple Account class as an example.

Enriching a Simple Account Class

Throughout this article I will enrich a simple Python class with various dunder methods to unlock the following language features:

  • Initialization of new objects
  • Object representation
  • Enable iteration
  • Operator overloading (comparison)
  • Operator overloading (addition)
  • Method invocation
  • Context manager support (with statement)

You can find the final code example here. I’ve also put together a Jupyter notebook so you can more easily play with the examples.

Object Initialization: __init__

Right upon starting my class I already need a special method. To construct account objects from the Account class I need a constructor, which in Python is the __init__ dunder:

    class Account:
        """A simple account class"""

        def __init__(self, owner, amount=0):
            """
            This is the constructor that lets us create
            objects from this class
            """
            self.owner = owner
            self.amount = amount
            self._transactions = []

The constructor takes care of setting up the object. In this case it receives the owner name and an optional start amount, and defines an internal transactions list to keep track of deposits and withdrawals.

This allows us to create new accounts like this:

    >>> acc = Acco[...]
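The slicing remark earlier in this excerpt comes without code, so here is a minimal sketch of __len__ and __getitem__ working together. The `TransactionLog` class is a hypothetical example written for this note, not part of the article's Account class.

```python
class TransactionLog:
    """A tiny container that supports len(), indexing and slicing."""

    def __init__(self, amounts=()):
        self._amounts = list(amounts)

    def __len__(self):
        # Called by len(log)
        return len(self._amounts)

    def __getitem__(self, index):
        # Called by log[i] and log[start:stop]; for a slice, `index` is a
        # slice object, which the underlying list already knows how to handle.
        return self._amounts[index]


log = TransactionLog([20, -10, 50, -30])

print(len(log))
print(log[0])
print(log[1:3])
```

Because __getitem__ is defined, Python can also iterate over `log` by falling back to successive integer indexes until an IndexError is raised.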



pythonwise: Who Touched the Code Last? (git)

Mon, 19 Jun 2017 19:20:58 +0000

Sometimes I'd like to know who to ask about a piece of code. I've developed a little Python script that shows the last people who touched a file/directory and the ones who touched it most.

Example output (on arrow project)
$ owners
Most: Wes McKinney (39.5%), Uwe L. Korn (15.3%), Kouhei Sutou (10.8%)
Last: Kengo Seki (31M), Uwe L. Korn (39M), Max Risuhin (23H)

So ask Wes or Uwe :)

Here's the code:
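The author's script is not included in this excerpt. What follows is not that script, only a minimal sketch of one way to get similar "most" and "last" lists from `git log`. The function names `author_log` and `summarize` are assumptions, and it reports commit shares and names only, without the time-since-last-touch shown in the example output.

```python
import subprocess
from collections import Counter


def author_log(path="."):
    """Author names of commits touching `path`, newest first (needs git)."""
    out = subprocess.check_output(
        ["git", "log", "--format=%an", "--", path],
        universal_newlines=True,
    )
    return [line for line in out.splitlines() if line]


def summarize(authors, top=3):
    """Return ([(name, percent of commits)], [most recent distinct names])."""
    counts = Counter(authors)
    total = len(authors)
    most = [(name, 100.0 * n / total) for name, n in counts.most_common(top)]
    last = []
    for name in authors:  # authors are ordered newest first
        if name not in last:
            last.append(name)
        if len(last) == top:
            break
    return most, last
```

Inside a repository, `summarize(author_log("some/path"))` would give the two lists; formatting them like the output above is left out of the sketch.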