You are very unlikely to break anything, so just give it a go. After a while, it can get quite tiresome to keep retyping Python statements over and over again. We can avoid this by saving results to a location in the computer's memory and giving the location a name. Such a named place is called a variable. In Python we create variables by assignment, which involves putting a value into the variable.
Here we have created a variable called msg (short for 'message') and set it to have the string value 'Hello World'. Notice that the Python interpreter does not print any output; it only prints output when a statement returns a value, and an assignment statement returns no value.
We then inspect the contents of the variable by naming it on the command line: that is, we type the name msg. The interpreter responds by printing out the contents of the variable.
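The assignment and inspection steps just described can be sketched as follows. This is a minimal illustration rather than the original listing: the value 'Hello World' comes from the text, while the use of repr() to reproduce what the interactive prompt echoes, and the helper name shown, are assumptions made so the example also works when run as a script (Python 3).

```python
# Assignment stores the string in a variable named msg; it produces no output.
msg = 'Hello World'

# At the interactive prompt, typing just the name msg echoes its representation.
# In a script, repr() produces that same text explicitly.
shown = repr(msg)  # 'shown' is a hypothetical helper name
print(shown)       # prints 'Hello World', including the quotation marks
```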
The names we choose for the variables are up to us. Instead of msg and num, we could have used any names we like. The reason for choosing meaningful variable names is to help you, and anyone who reads your code, to understand what it is meant to do.
Here we have taken the value of msg, multiplied it by 3, and then stored the new string HiHiHi back into the variable msg.
So far, when we have wanted to look at the contents of a variable or see the result of a calculation, we have just typed the variable name into the interpreter.
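To make the renaming and repetition steps concrete, here is a small sketch. The names greeting and repeat_count are hypothetical stand-ins for msg and num, chosen only to show that Python accepts any legal name; the starting value 'Hi' is inferred from the result HiHiHi mentioned in the text.

```python
# Any valid names work; these two are hypothetical alternatives to msg and num.
greeting = 'Hi'
repeat_count = 3

# Multiply the string and store the result back under the same name.
greeting = greeting * repeat_count
print(greeting)  # HiHiHi
```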
We can also see the contents of msg using print msg. On close inspection, you will see that the quotation marks that indicate that Hello World is a string are missing in the second case. That is because inspecting a variable, by typing its name into the interactive interpreter, prints out the Python representation of a value.
In contrast, the print statement only prints out the value itself, which in this case is just the text contained in the string. In fact, you can use a sequence of comma-separated expressions in a print statement. If you have created some variable v and want to find out about it, then type help(v) to read the help entry for this kind of object.
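The difference between inspecting and printing can be sketched as below. Note the assumptions: the example is written for Python 3, where print is a function called with parentheses (the print msg form in the text is Python 2 syntax), and the value given to num is hypothetical because the text does not state it.

```python
msg = 'Hello World'
num = 3  # hypothetical value; the text names num but does not give its value

as_inspected = repr(msg)  # the Python representation, with quotation marks
as_printed = str(msg)     # the value itself, just the text

print(as_inspected)
print(as_printed)

# Several comma-separated expressions in one print call.
print(msg, num, msg * 2)
```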
Type dir(v) to see a list of operations that are defined on the object. You need to be a little bit careful in your choice of names (or identifiers) for Python variables. Some of the things you might try will cause an error. First, you should start the name with a letter, optionally followed by digits (0 to 9) or letters. Thus, abc23 is fine, but 23abc will cause a syntax error. You can use underscores (both within and at the start of the variable name), but not a hyphen, since this gets interpreted as an arithmetic operator.
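One way to check a candidate name without triggering a syntax error is sketched below. It assumes Python 3, where str.isidentifier() and the standard keyword module are available; it also tests for reserved words, the second naming problem taken up next. The helper is_usable_name and the names my_var and my-var are hypothetical, added only for illustration.

```python
import keyword

# Candidate names taken from the text, plus two hypothetical ones
# (my_var and my-var) to illustrate underscores and hyphens.
candidates = ['abc23', '23abc', '_abc', 'my_var', 'my-var', 'not']

def is_usable_name(name):
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_usable_name(name) for name in candidates}
for name, ok in results.items():
    print(name, ok)
```

Here abc23, _abc and my_var pass, while 23abc (starts with a digit), my-var (contains a hyphen) and not (a keyword) are rejected.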
A second problem is shown in the following snippet. Why is there an error here? Because not is reserved as one of Python's 30-odd keywords. These are special identifiers that are used in specific syntactic contexts, and cannot be used as variables. It is easy to tell which words are keywords if you use IDLE, since they are helpfully highlighted in orange. The Python interactive interpreter performs your instructions as soon as you type them.
Often, it is better to compose a multi-line program using a text editor, then ask Python to run the whole program at once. Try this now by entering a one-line program that assigns a value to msg. Save this program in a file called test. Then run it and look at the result in the main IDLE window. Now, where is the output showing the value of msg?
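The answer is that a program run from a file, unlike the interactive interpreter, shows a value only if you explicitly tell it to. So add another line to the program that prints msg.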
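The resulting two-line program might look like the sketch below. It assumes that the original one-line program was the assignment to msg used earlier in the section, and it uses the Python 3 print function rather than the Python 2 print statement.

```python
# First line: the assumed original one-line program.
msg = 'Hello World'

# Added line: explicitly print the value so it appears when the file is run.
print(msg)
```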
Select Run Module again, and this time you should get output showing the value of msg. From now on, you have a choice of using the interactive interpreter or a text editor to create your programs. It is often convenient to test your ideas using the interpreter, revising a line of code until it does what you expect, and consulting the interactive help facility.
Try the examples in section 1. Create a variable called msg and put a message of your own in this variable. Remember that strings need to be quoted, so type your message inside quotation marks. Now print the contents of this variable in two ways, first by simply typing the variable name and pressing enter, then by using the print command. Try various arithmetic expressions using this string.
Strings are so important that we will spend some more time on them.
Here we will learn how to access the individual characters that make up a string, how to pull out arbitrary substrings, and how to reverse strings. The positions within a string are numbered, starting from zero. To access a position within a string, we specify the position inside square brackets. This is called indexing or subscripting the string. The position we specify inside the square brackets is called the index. We can retrieve not only letters but any character, such as the space at index 5.
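A brief sketch of indexing, assuming msg still holds 'Hello World' (the variable names first, fourth and space are hypothetical):

```python
msg = 'Hello World'

first = msg[0]   # 'H': positions are numbered from zero
fourth = msg[3]  # 'l'
space = msg[5]   # ' ': the space between the two words

print(repr(first), repr(fourth), repr(space))
```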
Be careful to distinguish between the string ' ', which is a single whitespace character, and '', which is the empty string. The fact that strings are indexed from zero may seem counter-intuitive. You might just want to think of indexes as giving you the position in a string immediately before a character, as indicated in Figure 1. The index of 11 is outside of the range of valid indices (0 to 10 for 'Hello World'). This results in an error message. This time it is not a syntax error; the program fragment is syntactically correct.
Instead, the error occurred while the program was running. The Traceback message indicates which line the error occurred on (line 1 of "standard input").
It is followed by the name of the error, IndexError, and a brief explanation. In general, how do we know what we can index up to? If we know the length of the string is n, the highest valid index will be n-1. We can get access to the length of the string using the built-in len function. Informally, a function is a named snippet of code that provides a service to our program when we call or execute it by name.
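The relationship between the length and the highest valid index can be checked with a short sketch. It assumes msg is 'Hello World'; the try/except wrapper and the name error_name are additions made here so the example runs to completion instead of stopping at the error.

```python
msg = 'Hello World'

n = len(msg)       # 11 characters
last = msg[n - 1]  # 'd': the highest valid index is n-1

# Indexing at n itself is out of range and raises IndexError at run time.
try:
    msg[n]
    error_name = None
except IndexError as err:
    error_name = type(err).__name__
    print(error_name, err)
```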
We call the len function by putting parentheses after the name and giving it the string msg we want to know the length of. We have seen what happens when the index is too large. What about when it is too small? Let's see what happens when we use values less than zero. This does not generate an error. Instead, negative indices work from the end of the string, so -1 indexes the last character, which is 'd'.
Now the computer works out the location in memory relative to the string's address plus its length, subtracting the index. We can also visualize negative indices as shown in Figure 1.
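As a sketch of how negative indices relate to ordinary ones, again assuming msg is 'Hello World': the index -i picks out the same character as len(msg) - i. The names used below are hypothetical.

```python
msg = 'Hello World'

last = msg[-1]        # 'd'
third_last = msg[-3]  # 'r'

# Each negative index -i refers to the same character as len(msg) - i.
equivalent = all(msg[-i] == msg[len(msg) - i] for i in range(1, len(msg) + 1))
print(last, third_last, equivalent)
```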
Thus we have two ways to access the characters in a string, from the start or the end. In NLP we usually want to access more than one character at a time. This is also pretty simple; we just need to specify a start and end index. For example, msg[1:4] accesses the substring starting at index 1, up to but not including index 4.
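A minimal sketch of this slice, assuming msg is 'Hello World' (the names sub and pieces are hypothetical):

```python
msg = 'Hello World'

sub = msg[1:4]                     # characters at indices 1, 2 and 3
pieces = [msg[1], msg[2], msg[3]]  # the same characters, fetched one at a time

print(sub)     # ell
print(pieces)  # ['e', 'l', 'l']
```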
The notation 1:4 is known as a slice. Here we see the characters are 'e', 'l' and 'l', which correspond to msg[1], msg[2] and msg[3], but not msg[4].
This is because a slice starts at the first index but finishes one before the end index. This is consistent with indexing: indexing also starts from zero and goes up to one before the length of the string. We can see this by slicing with the value of len. We can also slice with negative indices; the same basic rule of starting from the start index and stopping one before the end index applies, so we can, for instance, stop before the space character. Python provides two shortcuts for commonly used slice values. If the start index is 0 then you can leave it out, and if the end index is the length of the string then you can leave it out.
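The slicing rules and shortcuts above can be sketched as follows, assuming msg is 'Hello World'. The variable names are hypothetical, and the particular negative end index -6 is an assumption chosen so that the slice stops just before the space.

```python
msg = 'Hello World'

whole = msg[0:len(msg)]   # slicing up to len(msg) gives the entire string
before_space = msg[0:-6]  # 'Hello': stops one before index -6, the space

# Shortcuts: omit a start index of 0, or an end index equal to the length.
first_three = msg[:3]     # 'Hel'
from_six = msg[6:]        # 'World'

print(whole, before_space, first_three, from_six)
```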
The first example above selects the first three characters from the string, and the second example selects from the character with index 6, namely 'W', to the end of the string.
Write a Python statement that changes this to "colourless" using only the slice and concatenation operations. Then try some more of your own. Guess what the result will be before executing the command.