UnicodeDecodeError: Python Strings are not what you think!
by Nitin Chadha (speaking)
Objective
The world is not in ASCII anymore! This session aims to introduce Unicode, its implementation in Python, and why Python strings are not the same as strings in C, Java, etc. It covers everything from the evolution of strings (characters) to their storage and how they are handled by Python (which is not so perfect). By the end, I will make sure that you are able to decipher text even in Portuguese for your applications.
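A minimal sketch of the storage point, assuming Python 3 (where `str` holds Unicode code points and `bytes` holds encoded data). The Portuguese word and the variable names are only illustrative choices:

```python
# Assumes Python 3: str is a sequence of code points, bytes is encoded data.
word = "a\u00e7\u00e3o"  # the Portuguese word "ação", written with escapes

utf8_bytes = word.encode("utf-8")      # c-cedilla and a-tilde take 2 bytes each
latin1_bytes = word.encode("latin-1")  # every character here fits in 1 byte

print(len(word))          # number of code points
print(len(utf8_bytes))    # number of bytes when stored as UTF-8
print(len(latin1_bytes))  # number of bytes when stored as Latin-1
```

The same text has one length as characters and different lengths as bytes depending on the encoding, which is why a string cannot be treated as a plain byte array the way a C `char*` often is.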
Description
As the internet grows, it supports more languages, and as new languages come on board, the complexity of handling them also increases. Today, strings form the basis of many fields such as web development, NLP, and machine learning, but how they are implemented, stored in memory, and processed is often overlooked, because people coming from different backgrounds bring their own assumptions. Those assumptions do not hold true anymore.
This session introduces the what, why, and how of Unicode and its implementation in Python, and talks about various ways you can avoid errors like UnicodeDecodeError. We will go into detail on how the complex system of Unicode evolved, understand its various nuances, and see why programmers often overlook these details and why your customers see lines and lines of �����������.
I will also explain how Python supports Unicode and how it messes it up, to what varying degrees Python's libraries and frameworks support Unicode, and how Python 3.0 improves on it.
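As a rough illustration of where UnicodeDecodeError and the � characters come from, here is a small sketch. It assumes Python 3, and the byte string simply stands in for data arriving from a file or the network. The sample word and variable names are hypothetical.

```python
# Assumes Python 3. The byte string stands in for data received from outside.
raw = "a\u00e7\u00e3o".encode("latin-1")  # "ação" stored as Latin-1 bytes

try:
    raw.decode("utf-8")  # wrong guess about the encoding
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    print("decode failed:", exc.reason)

# errors="replace" hides the exception but substitutes U+FFFD for bad bytes
garbled = raw.decode("utf-8", errors="replace")

# Decoding with the encoding that was actually used recovers the text
recovered = raw.decode("latin-1")

# ascii() keeps the output printable on any terminal
print(ascii(garbled), ascii(recovered))
```

Silencing the error with `errors="replace"` is what turns a crash into a screen full of replacement characters; the actual fix is knowing which encoding the bytes were written in.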
After the session, you will be able to make out why emails from Japan are shown as � and what you need to do to avoid such things in your own websites or applications.
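One plausible way such an email ends up unreadable, sketched under assumptions: Python 3, a sender using the legacy Shift_JIS charset (an illustrative choice), and a receiver that ignores the declared charset and assumes UTF-8. All names here are hypothetical.

```python
# Assumes Python 3. Shift_JIS is an illustrative choice of legacy charset.
greeting = "\u3053\u3093\u306b\u3061\u306f"  # Japanese "konnichiwa" (hello)
wire_bytes = greeting.encode("shift_jis")     # what the sender actually sent

# Receiver ignores the declared charset and assumes UTF-8
shown_wrong = wire_bytes.decode("utf-8", errors="replace")

# Receiver honours the charset the sender declared
shown_right = wire_bytes.decode("shift_jis")

print(ascii(shown_wrong))  # contains \ufffd replacement characters
print(shown_right == greeting)
```

The avoidance strategy this suggests is to decode at the boundary using the charset the data declares, work with text internally, and encode explicitly on the way out.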
I will also touch on Internationalization and Localization.
Speaker bio
I am not a geek by nature, but I have a good understanding of technology and programming. I am naturally motivated to help people take their first steps in programming, and as such I am a co-founder of Panjab Univ LUG (pulug@googlegroups.com) and have spoken at various conferences (including PyCon 2011).
I love to understand and design systems and architectures, and as such I have worked with companies like Google and Amazon and, presently, LinkedIn, where I am working on challenges every day (and yes, I am loving it all!).
In my free time, I do not know what I do, but mostly my time passes planning my schedule, reading articles, and running.
Here is one of my favorite talks about Unicode: http://nedbatchelder.com/text/unipain.html.
The link is not working, Kracekumar Ramaraju.
http://nedbatchelder.com/text/unipain.html There is no '.' at the end. :)
It is a good talk.