What Computers Are, Part 2

J. Greg Davidson On Computing

Read Part 1 First!

In Part 1 we learned about Switches, Bits, Bytes and Words.

We're going to want to find out how to build arbitrarily large and sophisticated Data Structures out of those small building blocks.

But first: We need to go down a level to discover the more primitive switches used to build bits and implement computer operations.

How Memory and Operations work physically!

In modern computers memory and operations are implemented by tiny physical devices called transistors.

In digital electronics, transistors are switches. But not the well-behaved kind we looked at in Part 1.
Transistors change their output almost immediately when their inputs change, and they tend to consume power all the time.

Fortunately, a collection of transistors can be wired up so that it holds a value constant, using little power, until something needs to change.

Fast Memory

Logic Gates to Operations

A few transistors can be wired together to create logic gates, which compute simple operations on single bits. For example, a NAND gate computes NOT AND:

Input 1   Input 2   (AND)   Output (NAND)
false     false     false   true
false     true      false   true
true      false     false   true
true      true      true    false
Logic gates can be wired together to create more complex operations on groups of bits, e.g. words.
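To make this concrete, here is a small sketch in Python (the function names are ours, purely illustrative): NAND written directly, and NOT, AND, and OR built from nothing but NAND. This universality is why a single gate type suffices to wire up more complex operations.

```python
# NAND as a function of two bits (illustrative, not how hardware is specified).
def nand(a, b):
    return not (a and b)

def inv(a):          # NOT built from NAND
    return nand(a, a)

def and_gate(a, b):  # AND = NOT(NAND)
    return inv(nand(a, b))

def or_gate(a, b):   # OR via De Morgan: a OR b = NAND(NOT a, NOT b)
    return nand(inv(a), inv(b))

# Reproduce the NAND truth table from the text:
for a in (False, True):
    for b in (False, True):
        print(a, b, nand(a, b))
```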

Operations work on Registers

At this lowest level, computing consists of

  1. Bits, Bytes and Words being transferred to Registers from Main Memory or Input Devices.
  2. Operations being carried out between Registers.
  3. Results being transferred to Main Memory or Output Devices.
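The three steps above can be sketched as a toy model in Python (the memory contents, register names, and operation are all invented for illustration):

```python
# A toy model of the load / operate / store cycle (all names hypothetical).
memory = [0] * 16          # "Main Memory": 16 words
memory[0], memory[1] = 5, 7

registers = {"r0": 0, "r1": 0, "r2": 0}

# 1. Transfer words from Main Memory into Registers
registers["r0"] = memory[0]
registers["r1"] = memory[1]

# 2. Carry out an operation between Registers
registers["r2"] = registers["r0"] + registers["r1"]

# 3. Transfer the result back to Main Memory
memory[2] = registers["r2"]

print(memory[2])  # 12
```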

Going Large Now: Data Structures

Data Structures are the heart of sophisticated computer programs. Arbitrarily large and complex data structures can be created from Bytes and Words using only four strategies, each of which builds on the earlier ones and the techniques introduced above.

  1. Store Data Objects Contiguously in Memory

    This is the strategy we applied in building Bits into Bytes and Bytes into Words. We can continue using this strategy to build larger objects.

    We generally distinguish between three kinds of contiguous objects:

    Arrays aka Vectors
    have elements which are all of the same size
    Structures aka Records
    have fields which may be of different sizes
    Dictionaries aka Hash Tables
    store Elements in an Array at Locations based on Key Values.
    • A Hash Function is used to map a Key Value to an integer Hash Code suitable for use as an Array Index.
      • You want the integers to be as random as possible,
      • with a range from 0 to about twice the number of expected items.
      • Creating a good Hash Function is a bit of an art!
  2. Graph Structures: Objects Connected by Pointers
    Pointer aka Address
    the location of an object in Main Memory
    • Think of Main Memory as a huge array of bytes numbered from 0.
    • All objects in Main Memory consist of one or more bytes.
    • An object's address is the address (number) of its first byte.
    • Array elements can contain Pointers to any Object in Memory.
    • Records can have fields which contain Pointers to other Objects.
    • Each Node aka Component of a Graph Structure
      • can be located anywhere in memory
      • can be shared, i.e. can be part of multiple Graph Structures!
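The hash-table strategy can be sketched as follows in Python (the hash function and the linear-probing collision scheme are illustrative choices, not recommendations; real implementations also handle resizing and deletion):

```python
# A minimal hash-table sketch built on a plain Array.

def hash_code(key, table_size):
    # A toy hash function: mix the characters of a string key
    # into an integer suitable for use as an Array Index.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % table_size
    return h

# An Array with roughly twice as many slots as expected items:
table = [None] * 8

def put(key, value):
    i = hash_code(key, len(table))
    # Linear probing: step forward on a collision.
    while table[i] is not None and table[i][0] != key:
        i = (i + 1) % len(table)
    table[i] = (key, value)

def get(key):
    i = hash_code(key, len(table))
    while table[i] is not None:
        if table[i][0] == key:
            return table[i][1]
        i = (i + 1) % len(table)
    return None

put("apple", 3)
put("pear", 5)
print(get("apple"))  # 3
```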

A key property of each of these strategies is how they allow a program to find the component parts of each object.

Array
The location of any element is the array's base address plus its index times the element size.
Structure
The program uses the known offset of each field from the beginning of the structure.
Hash
The program computes the hash from a key to find the desired element.
Graph
The program uses the stored pointer to find the desired element.
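The Array and Structure lookup rules above can be demonstrated with Python's standard struct module, which packs values contiguously into bytes much as memory holds them (the field layout here is our own example):

```python
import struct

# Pack three 4-byte integers contiguously, like a little array in memory:
buf = struct.pack("<3i", 10, 20, 30)

# Array rule: element location = base + index * element_size
index, size = 2, 4
(value,) = struct.unpack_from("<i", buf, index * size)
print(value)  # 30

# Structure rule: each field lives at a known offset from the start.
# A record with a 2-byte id followed by a 4-byte count:
rec = struct.pack("<hi", 7, 1000)
(ident,) = struct.unpack_from("<h", rec, 0)                     # offset 0
(count,) = struct.unpack_from("<i", rec, struct.calcsize("<h")) # offset 2
```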

Because their storage is not contiguous, Graph Structures can grow and rearrange their Nodes without having to move or copy any other parts!
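Here is a minimal sketch in Python, where object references play the role of pointers (the Node class is our own illustration): inserting a new node into the middle of a linked list touches only one pointer, and no other node moves.

```python
# A minimal linked-list sketch: each Node holds a value and a
# "pointer" (Python reference) to the next Node, or None at the end.

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build a list: 1 -> 3
head = Node(1, Node(3))

# Insert 2 between them: no other node moves or is copied.
head.next = Node(2, head.next)

# Walk the pointers to collect the values in order:
values = []
node = head
while node is not None:
    values.append(node.value)
    node = node.next
print(values)  # [1, 2, 3]
```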

Most advanced Data Structures are various kinds of Graph Structures.

If you'd like some examples of these strategies, see the References at the end of this article.

Low-Level Assembly-Language Programming

The lowest level a human programmer can work at is the language defined by the Instruction Set Architecture aka ISA of a given family of CPUs.

In practice, low-level programmers write programs in an Assembly Language which is

  1. unique to a specific ISA
  2. a text notation more human-friendly than binary ISA instructions
  3. easily translated into the binary ISA instructions

The workflow goes like this:

  1. A low-level programmer writes their program in Assembly Language with a text editor.
  2. An Assembler program translates the assembly-language source program into a binary program file.
  3. When the binary program is loaded into the computer the hardware of the machine starts translating and executing the ISA instructions.
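As a sketch of step 2, here is a toy assembler for an invented two-instruction ISA. Everything here, the mnemonics, the opcode numbers, and the instruction format, is hypothetical; real ISAs and Assemblers are far richer:

```python
# A toy assembler for an invented ISA: each instruction is one byte,
# high 4 bits = opcode, low 4 bits = operand.

OPCODES = {"LOAD": 0x1, "ADD": 0x2}

def assemble(line):
    """Translate one source line like 'ADD 5' into one instruction byte."""
    mnemonic, operand = line.split()
    return (OPCODES[mnemonic] << 4) | (int(operand) & 0xF)

source = ["LOAD 3", "ADD 5"]          # assembly-language source program
binary = bytes(assemble(line) for line in source)
print(binary.hex())  # "1325"
```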

High-performance CPUs don't directly execute ISA instructions!

An ISA is defined for a family of compatible Processors (CPUs) implemented by many CPU models which can all understand the same ISA instructions.

Each model of CPU within a Processor family has its own Micro-Architecture with its own hidden internal machine language.
It translates the generic family ISA instructions into its own internal language.
It may also understand (translate) some extra ISA instructions which only some models of the Processor family support.

The advantages of this arrangement are that

  • programs compiled for the family's ISA keep working on every model, old and new, and
  • each model's Micro-Architecture is free to adopt new internal tricks for speed and efficiency.

Some disadvantages of this arrangement are that

  • the translation machinery adds complexity and consumes chip area and power, and
  • programmers cannot directly see or control what the hidden internal machine is doing.

For these reasons most programmers do not use Assembly Language much. Instead, most programmers write most of their code using High-Level Programming Languages.

Programming in High-Level Languages

An Assembly Language provides a text notation for a specific ISA's instructions. A fairly simple Assembler program translates an Assembly Language program into an equivalent binary program which a CPU can "understand", i.e. translate and execute.

A modern high-level programming language is designed for a human programmer's concepts of programming, not for any machine's architecture. A Compiler program can translate a program written in a high-level programming language into the binary language of a Processor family or of a particular model (or set of models) within a Processor family. Compilers use very sophisticated strategies to bridge the huge gap between the semantics of a High-Level Programming Language and a machine's ISA. The best compilers can produce more efficient code than all but the very best low-level programmers!
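One way to peek at this gap from a high-level language is Python's standard dis module, which prints the instructions the CPython compiler produces for a function. These are bytecodes for a virtual machine rather than a hardware ISA, but the flavor of the translation is similar:

```python
import dis

def average(a, b):
    return (a + b) / 2

# Show the lower-level instructions the CPython compiler produced
# from the one-line high-level function above:
dis.dis(average)
```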

Which High-Level Language Should You Use?

Most programmers find it easiest to learn and be productive in Programming Languages which are similar to languages they already know. This is unfortunate because most of the languages which programmers learn first, e.g. in Schools, are crude.

Don't just take our word for it. Read Paul Graham's account of how he became rich and influential because of his choice of language: Beating the Averages!

If you want to do powerful and creative things with computers it is important to (1) use a low-level language just long enough to understand how to build powerful abstractions and then (2) graduate to using higher-level languages which build in some of those key abstractions, supporting you in more easily writing much better programs.

Now, if you haven't read it yet, we recommend reading our guide to Choosing Computer Languages.

Learn Many Programming Paradigms!

A Programming Paradigm is a fundamental way of approaching programming, of thinking of computation. Problems which are difficult to solve using one Paradigm can often be easily solved by applying a different Paradigm. Programming Languages can be seen as belonging to different families based on which Programming Paradigms they are best at exploiting. One of the best ways of learning diverse and powerful Programming Paradigms is by programming for a while in elegant languages of diverse families!

Perhaps the real value of High-Level Programming Languages is that they make it easier to leverage powerful Programming Paradigms. The Programming Paradigms which are most useful for your needs should inform your choice of programming language.

Key reading material on Programming Paradigms

References for What Computers Are

Memory Hierarchy
Computers use several kinds of memory with very different characteristics
Composite Data Structures
Details with code in multiple High-Level Languages
GeeksForGeek's Data Structures
Find some examples you would like to use. How might you implement them?