Categories
Python

Python: control structures

a = [0, 1, 'hello', 'python']
for i in a:
    print i

for i, j in enumerate(a):
    print i, j  # Same as above, but also prints the index

i = 0
while i < 10:
    i = i + 1

break can be used to exit a loop early, and continue to skip to the next iteration
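As a small illustration of the two keywords, here is a sketch in Python 3 syntax (so print is a function, unlike the Python 2 snippets in these notes). The list and the variable names are made up for the example:

```python
# Hypothetical data, for illustration only.
numbers = [1, 2, 3, 4, 5, 6]
seen = []

for n in numbers:
    if n % 2 == 0:
        continue  # skip even numbers and move on to the next item
    if n > 4:
        break  # leave the loop entirely
    seen.append(n)

print(seen)
```

Only 1 and 3 end up in seen: the even numbers are skipped by continue, and the loop stops at 5 because of break.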

if i == 0:
    print 'zero'
elif i == 1:
    print 'one'
else:
    print 'other'

if i == 0 or (i == 1 and i + 1 == 2):
    print '0 or 1'

Try / except statements (Python's version of try / catch):

try:
    a = 1 / 0
except Exception as e:
    print 'oops: %s' % e
else:
    print 'no problem here'
finally:
    print 'done'

You can have more than one except clause (for different kinds of exceptions)
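A minimal sketch of several except clauses on one try, written in Python 3 syntax. The function safe_divide is a hypothetical helper invented for this example:

```python
def safe_divide(a, b):
    """Hypothetical helper: returns a / b, or None when the division fails."""
    try:
        result = a / b
    except ZeroDivisionError as e:
        # Runs only when b is zero.
        print('cannot divide by zero: %s' % e)
        return None
    except TypeError as e:
        # Runs only when the operands cannot be divided (for example a string).
        print('wrong types: %s' % e)
        return None
    else:
        # Runs only when no exception was raised.
        return result
    finally:
        # Always runs, whatever happened above.
        print('done')


safe_divide(4, 2)
safe_divide(1, 0)
safe_divide('a', 2)
```

Python tries the except clauses from top to bottom and runs the first one that matches, so more specific exceptions should come first.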

 


Python: the barebone basics, and a bit more

lists (or arrays?)

a = [1, 2, 3]

a.insert(0, 0)  # insert 0 at the beginning of the list: [0, 1, 2, 3]

a.append(4)  # append at the end: [0, 1, 2, 3, 4]

b = [5, 6]

c = a + b  # lists can be concatenated

print c[:2]  # and sliced (prints [0, 1])

print c[2:]  # prints [2, 3, 4, 5, 6]

for i in c:
    print i  # and iterated...

d = [x * 3 for x in a if x % 2 == 0]  # Or mapped to create a new list (d = [0, 6, 12])

tuples

They are immutable lists, useful to package objects:

a = (1, 2, 'hello')

x, y, z = a  # x is 1, y is 2, z is 'hello'

a = 1,  # The () are optional, but if you have only one object, put a trailing comma to indicate a tuple
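To make the "immutable" and "packaging" points concrete, here is a short sketch in Python 3 syntax. The names point and min_max are made up for the example:

```python
point = (1, 2, 'hello')

error = None
try:
    point[0] = 99  # tuples do not support item assignment
except TypeError as e:
    error = str(e)

single = 1,         # a one-element tuple, thanks to the trailing comma
not_a_tuple = (1)   # just the integer 1 in parentheses


def min_max(values):
    """Hypothetical helper that packages two results in one tuple."""
    return min(values), max(values)


low, high = min_max([3, 1, 2])  # unpack the tuple into two names
print(error, single, not_a_tuple, low, high)
```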

dict

pretty much hash tables:

a = {'k': 'v', 'k2': 'v2'}

a['k']  # 'v'

a.has_key('k2')  # True (Python 2 only; 'k2' in a works everywhere)

a = dict(k='v', k2=3)  # Alternative way to build a dict

print a.values()  # ['v', 3] (order is not guaranteed)

print a.items()  # [('k', 'v'), ('k2', 3)] Note that the result is a list of tuples

del a['k']  # delete a key and its value

You can also delete items from lists by index (del a[1])

You can add values to the dictionary using update:

myDict.update(myOtherDic)  # Similar to a JavaScript extend. It merges myOtherDic into myDict, overwriting keys that already exist
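A quick sketch of what update does, in Python 3 syntax. The dictionary names and contents are invented for the example:

```python
my_dict = {'k': 'v', 'k2': 'v2'}
my_other_dict = {'k2': 'changed', 'k3': 3}

result = my_dict.update(my_other_dict)  # modifies my_dict in place

print(my_dict)  # 'k2' now holds the value from my_other_dict, and 'k3' was added
print(result)   # update() returns None, not the merged dict
```

Because update returns None, writing merged = my_dict.update(...) is a common mistake: the merged data lives in my_dict itself.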

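Getting the value of the first key (Python 2, where keys() returns a list; dicts are unordered there, so "first" is arbitrary):

myDict[myDict.keys()[0]]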

Functions

def f(a, b):
    return a + b  # Basically, functions have a definition (def) and a return

def f(a, b=2):  # Arguments can have default values
    return a + b, a - b  # You can have two returns! (they come back as a tuple)

x, y = f(b=5, a=2)  # Arguments can be passed by name, in any order: x = 7, y = -3

def f(*a, **b):  # This one is a trip! The first arg, *a, stores all the positional args, and **b stores all the keyword (name=value) args. So:
    return a, b

x, y = f(3, 'hello', c=4, test='world')  # x = (3, 'hello') and y = {'c': 4, 'test': 'world'}

lambda

Sort of like an anonymous function in JavaScript: a convenient way to avoid writing a full-fledged function for in-between operations.

map(lambda x: x + 2, a)  # Where a = [1, 4, 7] and the result is [3, 6, 9]
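The same idea as a runnable sketch in Python 3 syntax, where map returns a lazy iterator rather than a list. The words list and the variable names are made up for the example:

```python
a = [1, 4, 7]

# In Python 3, map() returns a lazy iterator, so wrap it in list() to see the values.
plus_two = list(map(lambda x: x + 2, a))

# The same result with a list comprehension, which many people find easier to read.
plus_two_again = [x + 2 for x in a]

# lambda is also handy as a sort key: here, sort words by their length.
words = ['python', 'is', 'fun']
by_length = sorted(words, key=lambda w: len(w))

print(plus_two, plus_two_again, by_length)
```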

Pretty printing objects in Python

import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(your_object_here)


On character encoding, the dark arts surrounding it, and how it relates to Python and JavaScript

1) Computers don’t manipulate characters, they manipulate bytes.

2) In order to assign meaning to bytes, we come up with conventions of what they mean. The simplest convention is called ASCII (128 byte values, 95 of which represent printable characters)

3) With 8 bits (one byte), you can represent 256 different characters. At first, we only filled out the first 128 with English characters, punctuation, and other oddities. The other 128 possibilities were filled with characters from non-English languages. The problem: we didn't do it in a consistent way until the ISO standards came along. But that wasn't enough, so along came Unicode, with about 1.1M possible characters (code points), of which only around 110K had been assigned at the time of writing.
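The numbers above can be checked from Python itself. This is a small sketch in Python 3 syntax; the characters are written as \u escapes only to avoid any encoding trouble in the source file:

```python
# ord() gives the Unicode code point of a character, chr() goes the other way.
code_a = ord('A')            # an ASCII character, below 128
code_e_acute = ord('\u00e9')  # e with an acute accent, fits in the 8-bit range
code_double_p = ord('\u2119')  # double-struck P, needs more than 8 bits

highest = 0x10FFFF                # the highest code point Unicode allows
total_code_points = highest + 1   # this is the "1.1M" figure

print(code_a, code_e_acute, hex(code_double_p), total_code_points)
```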

4) UTF-8: the problem with Unicode is that it is easier / more efficient to transfer data across the wire in 8-bit pieces (bytes) rather than in the 21 bits a Unicode character can require. To do that, UTF-8 uses some byte values the way we use the Shift key on a keyboard, so the most common characters take fewer bytes and we can pack more into less bandwidth. ASCII characters have exactly the same byte representation in UTF-8 and ASCII.

5) For Python 2: you can have strings represented in two different data types: byte strings (str) and Unicode strings (unicode).

Example of string: my_string = "Hello"

Example of unicode: my_string = u"Hello \u2119\u01b4"

In order to convert back and forth between the two, you use my_string.encode('ascii') or my_string.encode('utf-8') to go from unicode to str, and .decode() to go the other way.
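A round trip through UTF-8 can be sketched as follows. This uses Python 3 syntax, where the Unicode type is called str and the byte type is called bytes; the variable names are made up for the example:

```python
text = 'Hello \u2119\u01b4'   # a Unicode string (str in Python 3, unicode in Python 2)

utf8_bytes = text.encode('utf-8')        # Unicode -> bytes
round_trip = utf8_bytes.decode('utf-8')  # bytes -> Unicode

print(len(text))        # number of characters
print(len(utf8_bytes))  # number of bytes, larger because of the two non-ASCII characters

# ASCII characters have the same byte representation in ASCII and UTF-8.
same = 'Hello'.encode('ascii') == 'Hello'.encode('utf-8')
print(same)
```

The string has 8 characters but takes 11 bytes in UTF-8: the six ASCII characters use one byte each, while the last two use three and two bytes.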

6) Sometimes .encode() will fail: if you try to encode to ASCII a character that doesn't exist in ASCII, what do you think will happen? The second argument of .encode() tells Python what to do in those cases. By default it is:

my_string.encode('ascii', 'strict'), which will raise an exception (UnicodeEncodeError)

But it can be: "replace" (the character that can't be converted becomes ?), "ignore" (throw away the character), or "xmlcharrefreplace" (the character becomes its XML character reference)
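The four error handlers side by side, as a sketch in Python 3 syntax. The sample string is made up for the example:

```python
text = 'caf\u00e9'  # "cafe" with an accented e, which does not exist in ASCII

try:
    text.encode('ascii', 'strict')  # 'strict' is the default
    strict_result = 'no error'
except UnicodeEncodeError:
    strict_result = 'UnicodeEncodeError'

replaced = text.encode('ascii', 'replace')          # b'caf?'
ignored = text.encode('ascii', 'ignore')            # b'caf'
as_xml = text.encode('ascii', 'xmlcharrefreplace')  # b'caf&#233;'

print(strict_result, replaced, ignored, as_xml)
```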

7) Python 2 automatically converts between the two types, especially when you, for example, concatenate a str and a unicode. The best way to deal with this problem is to keep them separate and always know which of the two you are dealing with.

8) Python 3 does not convert automatically. Its strings (str) are Unicode. It also has a data type, bytes, that stores sequences of bytes: b"hello" != "hello" (the second hello is Unicode, therefore different)
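The b"hello" != "hello" point can be checked directly in Python 3. The variable names are made up for the example:

```python
as_bytes = b'hello'
as_text = 'hello'

print(type(as_bytes))       # the bytes type
print(type(as_text))        # the str type, which is Unicode
print(as_bytes == as_text)  # False, even though they look alike

# They compare equal once both sides are the same type.
decoded_matches = as_bytes.decode('ascii') == as_text
print(decoded_matches)
```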

9) If you try to concatenate strings of the different groups in Python 3, it will just throw an error. So the strategy is: deal with different encodings at the very beginning of your input, and at the very end of your output. In other words:

– As data comes in: decode it to Unicode. As it goes out: encode it to bytes.

– Inside your program: make sure you are always dealing with Unicode.

– If you use plugins, know what they are sending you (Unicode or bytes), and use the rules above. Keep in mind that just looking at a string of bytes won't tell you what kind of encoding it is in.
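The rules above can be sketched in Python 3 as a tiny function. The name shout, its parameters, and the assumption that the input is UTF-8 are all invented for this example; in a real program you have to know (or be told) the encoding of what comes in:

```python
def shout(raw_input_bytes, input_encoding='utf-8', output_encoding='utf-8'):
    """Hypothetical example: decode on the way in, work on text, encode on the way out."""
    text = raw_input_bytes.decode(input_encoding)  # bytes -> Unicode at the edge
    text = text.upper() + '!'                      # only Unicode inside the program
    return text.encode(output_encoding)            # Unicode -> bytes at the edge


result = shout('caf\u00e9'.encode('utf-8'))
print(result.decode('utf-8'))

# Mixing the two types inside the program fails in Python 3.
try:
    'abc' + b'def'
    mixed = 'no error'
except TypeError:
    mixed = 'TypeError'
print(mixed)
```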