Class Notes

Throughout this site:
text in blue boxes
  means a command for you to execute.
C-x
  means, in Emacs, hold down the <Ctrl> key and press x.
M-x
  means press and release the <Esc> key and then press x, or hold down the <Alt> key and press x.
$ command preceded by a dollar sign
  is a command to be typed in your shell. DO NOT type the dollar sign.

To do your homework:

Open the homework file and type in your answers. You should have a 'homework' directory on your flash drive that contains the hw.tex file. That is the file to work on. Use the LaTeX notecard to help you figure out math symbols that are not obvious.

Remember that all math symbols must be between $ ... $ or \[ ... \]. For example, for math within a sentence:

$\frac{1}{2}$
or for math centered in the page:
\[
H(x)=-\sum_x p(x)\log p(x)
\]
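
To see both forms in context, here is a minimal sketch of a complete file that should typeset on its own. The document class and the sentence about the coin are illustrative assumptions, not part of hw.tex.

```latex
\documentclass{article}
\begin{document}
% Inline math: single dollar signs inside a sentence.
A fair coin lands heads with probability $\frac{1}{2}$.

% Display math: \[ and \] center the formula on its own line.
\[
H(x)=-\sum_x p(x)\log p(x)
\]
\end{document}
```

Leaving out a closing $ or \] is the most common cause of typesetting errors, so check the delimiters first when a file refuses to compile.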
 
The basic commands you will need are:

Keys         To
C-x C-f      open file
C-c C-t C-p  switch to PDF mode
C-c ~        switch to math mode
`            prefix for math symbols, such as `a for \alpha
C-x C-s      save
C-c C-c      continue to the next step: typeset, view, finish a version
             control log comment, or clean up a BibTeX entry
C-x v v      do the next logical thing in version control

To hand in your homework:

Even if you give up on trying to typeset the LaTeX file, do try to type up at least some problems. To turn in your work, do one of the following:
  1. Make a compressed archive of your homework directory using the zip compression common on PCs. From the directory that contains homework:
    $ zip -r homework homework
    The -r switch tells zip to include the files inside the directory. This will create a file homework.zip that you can email as an attachment.
    (If you would like to try bzip2 compression:
    $ cd f:/
    $ tar -jcvf homework.tbz homework
    This will create a file homework.tbz that you can email as an attachment.)
  2. Bring your flash drive by the office.
  3. If you cannot do the above, print out the PDF file or, if you cannot get it to typeset, the source file hw.tex.
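
Before emailing an archive, it is worth checking that it really contains your files. The following is a minimal sketch that builds and lists a bzip2 archive in a scratch directory. The scratch directory and the one-line hw.tex are assumptions made so the example is self-contained; on your flash drive you would run only the two tar commands from the directory that contains homework.

```shell
#!/bin/sh
# Sketch: build and check a bzip2-compressed tar archive in a scratch directory.
# Assumption: the homework directory holds a single placeholder file, hw.tex.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir homework
printf '%s\n' '\documentclass{article}' > homework/hw.tex

# -c create an archive, -j compress with bzip2, -f name of the archive file
tar -cjf homework.tbz homework

# -t lists the contents without unpacking anything
listing=$(tar -tjf homework.tbz)
echo "$listing"
```

If hw.tex does not appear in the listing, the archive holds only the empty directory and should be rebuilt.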

Thursday June 5

Free Software

Huffman Coding in real life:
bzip compression software (note: a different kind of free software licence, BSD rather than GPL).

John Chambers

Java Turing Machine with nice explanation.

Turing Machine in Elisp (you can run in Emacs)
Solomonoff Publications (see 1964)

Chaitin home (see "On the length of programs for computing finite binary sequences " especially part 2)

Source for this? "Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems Inform. Transmission. 1, 1-7."

Chaitin lectures (See the Mälardalen University lecture)

Paul Shields (see his 'tutorial' for nice applications to statistics, ready to use in proofs.)

Mark Hansen
Find MDL paper with Bin Yu: "M. H. Hansen and B. Yu (2001), Model Selection and the Principle of Minimum Description Length, Journal of the American Statistical Association, Vol. 96, No. 454, pp. 746-774"? Not on his Papers page. Also see video lecture by Rissanen. Also online MDL Demo.

Deb Nolan

Yang Review

Monday June 9

What we did in class today:
Launch Emacs, get a version controlled project, and send commands one at a time to R.
  1. Insert your flash drive.
  2. Windows may prompt you to choose an option.
    Open in explorer
  3. Double click on the runemacs shortcut.
  4. Open an emacs shell
    M-x eshell
    M-x is produced by pressing ESC and then pressing x
    or by holding down the ALT key while pressing x.
  5. We tried a couple of commands. ls (list) shows what is in your directory:
    $ ls
    df -h (disk free, in human-readable form) shows your drives; the 3.8 GB drive is your flash drive:
    $ df -h
  6. Move to the root level of your flash drive
    $ cd f: 
    (or whatever the letter of your drive is)
  7. Get a branch of your professor's version controlled class directory
    $ bzr branch http://lchen.org/~lchen/infotheory

    You only do this the first time. In the future, change to the class directory and, to see whether changes have been made or new files added, use
    $ bzr pull
  8. Change into the class directory
    $ cd infotheory
    and see what is there
    $ ls
  9. Open the exercise file:
    C-x C-f 

    Use tab completion: type part of the file name and then press tab.
    This helps you avoid typing errors and allows you to know where you are, in addition to saving time.
  10. Start R
    M-x R
    R will ask you what directory you want to start in.
    Do not start in the default, your flash drive. Rather, delete the default choice and use your home directory:
    ~
    Switch between windows and the minibuffer (the small area at the bottom of the screen where commands ask for input and echo results) with
    C-x o
    switch between buffers with
     C-x b
    and split or unsplit screen with
    C-x 2
    C-x 1
  11. Step through the commands one at a time:
    C-c C-n 
  12. When finished, quit R:
    C-c C-q
    R asks whether to save the workspace image; answer n unless you created large objects in your R session and plan to come back to the same computer to keep working.
  13. Quit Emacs
    C-x C-c 
    Decide whether to save shell history and unsaved changes to your files.

Homework

Many of you started R in the default directory, which was probably somewhere on your flash drive. If you then saved your workspace when quitting, R wrote a workspace file there. This file takes up precious space on your drive, so it would be good to delete it. Here is how:
  1. Launch Emacs.
  2. M-x find-name-dired
  3. Choose the root of your flash drive.
  4. The name is something starting with .R or _R, so for the pattern to search for, enter:
    .R*
    If that finds nothing, use
    _R*
  5. When you have found the name and location, use eshell to remove the file:
    M-x eshell
    cd (path to your file)
  6. Remove the file using the rm command, with either tab completion to finish the name or the same 'wild card' notation used in the find function.
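
The same search and removal can also be done entirely from the shell. This is a minimal sketch under stated assumptions: a scratch directory stands in for the flash drive root, and the file names .RData and .Rhistory (the names R normally uses for a saved workspace and command history) are created only for the demonstration.

```shell
#!/bin/sh
# Sketch: find and delete R's saved files under a directory.
# Assumption: a scratch directory stands in for the flash drive root.
drive=$(mktemp -d)
mkdir -p "$drive/infotheory"
touch "$drive/infotheory/.RData" "$drive/infotheory/.Rhistory" "$drive/infotheory/exercise.R"

# Quote the patterns so the shell hands them to find unexpanded.
# Look before deleting:
found=$(find "$drive" -type f \( -name '.R*' -o -name '_R*' \))
echo "$found"

# Delete exactly the same matches:
find "$drive" -type f \( -name '.R*' -o -name '_R*' \) -exec rm {} +

# Whatever is left was not matched by either pattern.
remaining=$(find "$drive" -type f)
echo "$remaining"
```

Listing the matches first guards against a pattern that catches more than you intended, since rm does not ask for confirmation.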

Thursday June 12

In this class, we will retrieve a version controlled copy of the homework and learn how to write up our answers in LaTeX, keeping the whole process in version control.

Distributed version control works by keeping a hidden copy of all your work and each incremental change in a repository. We will use our whole flash drive as a shared version control repository. That means, roughly, that if you have two copies of a version controlled directory, there is only one hidden version, instead of two, saving space (see the manual for a more precise description of 'shared repository').

We will use one copy as a master version of what Lihua wrote. Update inside this directory (bzr pull) to keep it current with Lihua's most recent changes.

We will make a local branch in which to do the homework, writing in LaTeX right onto the homework handout (you can use multiple files if you prefer; that is good practice).

Check in your changes using version control as you make progress on the problems.

Give credit if you consulted with a classmate on how to do a problem and note collaboration in your version control log. The final revision should be obviously original and you should be able to explain it orally.

The whole version controlled directory should be turned in, perhaps as a bzip2 archive or by Lihua branching from your directory.
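
The layout described above (a shared repository, a mirror of the instructor's branch, and a working branch for answers) can be tried out safely in a scratch directory. This is a minimal sketch, not the class setup itself: the local branch named upstream stands in for Lihua's published branch, the identity in BZR_EMAIL is a placeholder, and the script does nothing if bzr is not installed.

```shell
#!/bin/sh
# Sketch of the mirror-plus-working-branch layout in a scratch directory.
# Assumptions: "upstream" stands in for the instructor's published branch,
# and the identity below is a placeholder.
if ! command -v bzr >/dev/null 2>&1; then
    echo "bzr is not installed; nothing to demonstrate"
    demo=skipped
else
    BZR_EMAIL="Student <student@example.com>"
    export BZR_EMAIL
    root=$(mktemp -d)
    cd "$root" || exit 1

    bzr init-repo .                   # shared repository: one hidden store for every branch below
    mkdir upstream
    (cd upstream && bzr init && echo 'Problem 1.' > hw.tex \
        && bzr add hw.tex && bzr commit -m "Hand out homework")

    bzr branch upstream infotheory    # mirror: never edited, only updated
    bzr branch infotheory homework    # working branch: answers go here

    echo 'Answer 1.' >> homework/hw.tex
    (cd homework && bzr commit -m "Answer problem 1; discussed with a classmate")

    (cd infotheory && bzr pull)       # pick up any new upstream revisions
    (cd homework && bzr log --line)   # one line per commit, newest first
    demo=done
fi
```

Keeping the mirror untouched means bzr pull never has anything to reconcile there, and the log of the homework branch records your own steps, including any collaboration you note in the commit messages.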

  1. Open a shell and update your working directory. See the available commands first:
    $ bzr help
  2. Should we identify ourselves? Choose a command from the list above and use your email address as the identifier. This is not really necessary on UT computers.
  3. Let's use the homework directory as a clean copy, following section 6.2 of the bzr user's guide.
  4. Maybe we should set up a shared repository? We should be able to move the whole infotheory branch, or, if you have not done anything you want to keep in that directory, you can try
    $ rm -r infotheory
    and just branch another copy, as in Monday's class.
    To make a shared repository you could just
    $ cd f:/
    $ bzr init-repo .
    at your flash drive root (init-repo creates a shared repository; plain init creates an ordinary branch). Advantages? Disadvantages? What if you want to move the whole repository? That should be OK with the rsync command, using -a to copy directories recursively and a switch to ignore the iPoIDE08 directory.
    $ rsync -a --exclude=iPoIDE08 f: ~/myhome/ 
  5. Let's pull to make a mirror:
    $ cd infotheory
    $ bzr pull
  6. Now branch: this is where you will work on your homework. (see the manual for URL formats)
    $ cd F:/ 
    $ bzr branch infotheory homework
    $ cd homework
  7. Have a look in the directory, and then open the hw.tex file.
    $ ls
    C-x C-f hw.tex
    
  8. This file needs some work. It is not good LaTeX. Have a look at the LaTeX tutorial and the 'writing a paper step by step' example provided there, then improve hw.tex by making the title a heading and making the arrays real arrays rather than pre-formatted text.

    Also, add a bibliography and include at least a citation for Cover's book (see Yang's book review at bottom of this page for full citation) from which the problems came.

  9. To view output press the preview button on the right hand side of your emacs toolbar.
  10. To make a pdf
    C-c C-t C-p
  11. Typeset with
     C-c C-c
    View with
     C-c C-c
    or
    M-x eshell
    SumatraPdf hw.pdf
  12. Be sure to commit changes (commit means make the changes you have done permanent for this version and record them with version control) as you go along. There are a number of keyboard shortcuts for important version control (bzr) commands built into emacs. To see a list try using Meta vc- and tab completion:
     M-x vc-<TAB> 
    When you issue a command with M-x, the minibuffer will briefly prompt you with the key binding for that command, if it has one. Try to learn the key bindings for the more common commands.

    One important command is

    C-x v v
    Executed while visiting a file that is under version control (see the status line), this does whatever makes logical sense at that point: update or commit. For a commit, it automatically pops up a log window, where you should always make a short log entry summarizing in a few words what you did. Expect that your log will be read (you will turn in the whole directory in this class).
    C-c C-c
    closes the log window and finishes the commit when you are done.
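
For step 8, here is a minimal sketch of the structure to aim for. The title, author, section name, matrix entries, and citation key are all placeholders, and the bibliography entry is deliberately incomplete: fill in publisher, edition, and year from Yang's review.

```latex
\documentclass{article}
\title{Homework}      % placeholder title
\author{Your Name}    % placeholder author
\begin{document}
\maketitle            % typesets the title as a real heading

\section*{Problem 1}
% A real array instead of pre-formatted text: & separates columns,
% \\ ends a row, and {cc} asks for two centered columns.
\[
P = \left( \begin{array}{cc}
  1/2 & 1/2 \\
  1/4 & 3/4
\end{array} \right)
\]
The problems are taken from Cover and Thomas~\cite{cover}.

\begin{thebibliography}{9}
% Complete this entry from the full citation in Yang's review.
\bibitem{cover} T.~M. Cover and J.~A. Thomas,
  \emph{Elements of Information Theory}.
\end{thebibliography}
\end{document}
```

Typeset twice so that the citation number resolves; after the first run the citation appears as a question mark.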

Working together

Working together to solve problems can be very valuable, when allowed and appropriate. MIT has an interesting policy on this: if students work together, they must give credit to the person they worked with, and each must submit their own writeup. If two writeups look like they were copied, the student suspected must give an oral explanation of all the work. A student who cannot is assumed to have cheated and is punished accordingly.

Version control automatically facilitates something like this for developers where, as we discussed in the section on free software, property rights are very important. If you have a vc log showing what you did and when, by committing as you make changes, you can demonstrate who did what and prove your ownership of key ideas if challenged.

Survey

Please give us some feedback to help us understand the usefulness of this system as required by the grant that helped us purchase the drives. In particular, please try to answer these questions:

What software do you use and what are its advantages for:

  1. writing R code,
  2. running R,
  3. writing mathematics,
  4. if you use LaTeX for writing mathematics, what software do you use for writing LaTeX code,
  5. editing text (performing operations on data sets or other plain text, such as adding text, search and replace, move rectangular selections, etc.),
  6. using shell commands,
  7. version control,
  8. interacting with other math programs (Matlab, Maple, etc.).
Any ideas on how to motivate students to try this system or to improve the system or the tutorial web pages are also welcome.

If you are graduating and are sure you will not use the flash drive system, please return it so other students can have a chance.