C Book Club for R Contributors

Session 1: Modern C Chapters 1-3

Heather Turner

May 16, 2023

Preface

Getting started

  • IDE
    • RStudio + Terminal
    • VS Code + C extension / VSCodium (FOSS version)
    • Emacs
    • Others: CLion ($), Vim, XCode (macOS) …, or text editor + terminal
  • Compiler, e.g. c99, gcc and clang
    • clang and gcc give better diagnostics
    • Installing XCode on macOS / RTools on Windows gives you clang / gcc
  • Modern C example code download has moved: https://inria.hal.science/hal-03345464

Level 0:
ENCOUNTER

Chapter 1: Getting Started

getting-started.c

/* This may look like nonsense, but really is -*- mode: C -*- */
#include <stdlib.h>
#include <stdio.h>

/* The main thing that this program does. */
int main(void) {
  // Declarations
  double A[5] = {
    [0] = 9.0,
    [1] = 2.9,
    [4] = 3.E+25,
    [3] = .00007,
  };

  // Doing some work
  for (size_t i = 0; i < 5; ++i) {
     printf("element %zu is %g, \tits square is %g\n",
            i,
            A[i],
            A[i]*A[i]);
  } 

  return EXIT_SUCCESS;
}

Takeaways

  • C is an imperative programming language.
    • The program is a set of orders to the computer to perform tasks
  • C is a compiled programming language
    • The compiler translates code info the binary code or executable
    • The target code is platform-dependent
      • Running and testing a program on a single platform does not guarantee portability

Exs 3 and 5: Compile then execute

Find which compilation command works for you.

  • -Wall = warn all
  • -o: where to store compiler output
  • -lm tells it to add some standard mathematical functions if necessary
$ clang -Wall -lm -o getting-started getting-started.c
$ ./getting-started
element 0 is 9,         its square is 81
element 1 is 2.9,       its square is 8.41
element 2 is 0,         its square is 0
element 3 is 7e-05,     its square is 4.9e-09
element 4 is 3e+25,     its square is 9e+50

Exs 6: Fix a bad program

clang -Wall -o bad bad.c
bad.c:4:1: warning: return type of 'main' is not 'int' [-Wmain-return-type]
void main() {
^
bad.c:4:1: note: change return type to 'int'
void main() {
^~~~
int
bad.c:16:6: error: call to undeclared library function 'printf' with type 'int (const char *, ...)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     printf("element %d is %g, \tits square is %g\n", /*@\label{printf-start-badly}*/
     ^
bad.c:16:6: note: include the header <stdio.h> or explicitly provide a declaration for 'printf'
bad.c:22:3: error: void function 'main' should not return a value [-Wreturn-type]
  return 0;                                           /*@\label{main-return-badly}*/
  ^      ~

Exs 6 (ctd): fix bad.c

/* This may look like nonsense, but really is -*- mode: C -*- */
#include <stdio.h>

/* The main thing that this program does. */
int main() {
  // Declarations
  int i;
  double A[5] = {
    9.0,
    2.9,
    3.E+25,
    00007
  };

  // Doing some work
  for (i = 0; i < 5; ++i) {
    
    printf("element %d is %g, \tits square is %g\n",
           i,
           A[i],
           A[i]*A[i]);                           
  }
  
  return 0;                                           
}

Exs 7: correct array definition

/* This may look like nonsense, but really is -*- mode: C -*- */
#include <stdio.h>

/* The main thing that this program does. */
int main() {
  // Declarations
  int i;
  double A[5] = {
    [0] = 9.0,
    [1] = 2.9,
    [4] = 3.E+25,
    [3] = .00007
  };

  // Doing some work
  for (i = 0; i < 5; ++i) {
    
    printf("element %d is %g, \tits square is %g\n",
           i,
           A[i],
           A[i]*A[i]);                           
  }
  
  return 0;                                           
}

Compilation

  • A C program should compile cleanly without warnings.
  • Use -Werror to not create an executable even when there are just warnings

Chapter 2: The principal structure of a program

Grammar (1)

  • special words: #include, int, void, double, for, return
    • cannot be changed
  • brackets: {}, (), [], /**/, and <>
  • comma: separate arguments or list elements
    • optionally after last element in list
  • semicolon: terminates a statement inside {}
  • comments: /* */ multiline, // one line
  • literals: fixed values that are part of the program
    • use double quotes for string literals

Grammar (2)

  • identifiers: names that we give to things
    • data objects/variables e.g. A, i
    • type aliases, e.g. size_t, specify sort of object
      • _t reminds us the identifier is a type
  • functions e.g . main
    • main must be present, it is the starting point of execution
  • constants e.g. EXIT_SUCCESS
  • operators, like R except:
    • = for initialization and assignment
    • ++ to increment

Exs 9: brackets

  • <> only for specifying headers to include
  • {} embracing body of function, embracing elements of array (vector), embracing multi-line conditional statement
  • [] specifying length of array, specifying elements of array, indexing
/* This may look like nonsense, but really is -*- mode: C -*- */
#include <stdlib.h>
#include <stdio.h>

/* The main thing that this program does. */
int main(void) {
  // Declarations
  double A[5] = {
    [0] = 9.0,
    [1] = 2.9,
    [4] = 3.E+25,
    [3] = .00007,
  };

  // Doing some work
  for (size_t i = 0; i < 5; ++i) {
     printf("element %zu is %g, \tits square is %g\n",
            i,
            A[i],
            A[i]*A[i]);
  } 

  return EXIT_SUCCESS;
}

Declarations

All identifiers have to be declared (maybe not directly in our program)

  • int main(void)

    • a function of type int
  • double A[5]

    • an array of items of type double, of length 5
  • size_t i

    • i is of type size_t
  • printf declared in stdio.h

  • size_t and EXIT_SUCCESS come from stdlib.h

    int printf(char const format[static 1], ...); 
    typedef unsigned long size_t;
    #define EXIT_SUCCESS 0

To look up something declared in a header file

apropos printf #loads of stuff
man printf #only printf R-like help file
man 3 printf #other related functions e.g. fprintf

Identifiers

  • Declaration limited to scope
    • block scope: limited within enclosing {}, e.g. for loop
    • file scope: not within {}, so limited by file -> globals
  • Definition: specifies object itself
    • simplification: always initialize variables
      • augment declaration with initial value
      • use [] to designate specific elements (e.g. in array)
      • missing elements default to 0
      • indices from zero: think of as distance from start of array
    • can have many declarations, but need exactly one definition

Statements (1)

Instructions telling compiler what to do with declared identifiers

  • Iteration, e.g. for
    • loop body between {}
    • arguments separated by semicolons
      1. declaration, definition and initialization of loop variable
      2. loop condition
      3. statement executed after each iteration: ++i
    • don’t define loop variable outside () or reuse for several loops
  • Function calls
    • C uses “call by value”: values of identifiers cannot be changed by fn

Statements (2)

  • Function return
    • last statement in function
    • must return value of declared type
    • brackets not required around return value

(vs “call be reference” which allows value of variable to be changed)

Level 1:
ACQUAINTANCE

Buckle Up

Details about how objects are defined. Some main points

  • focus on unsigned integers
  • gradually introduce pointers
  • use arrays where possible

Indentation Style

Recommends “One True Brace Style”

int main (int argc , char * argv [ argc +1]) {
  puts (" Hello   world !");
  if ( argc > 1) {
    while ( true ) {
      puts (" some   programs   never   stop ");
    }
  } else {
    do {
      puts ("but   this  one   does ");
    } while ( false );
  }
  return EXIT_SUCCESS ;
}

There is no standard in base R (see R Dev Guide #133) but K&R is used more and just a minor variant (as above, but first { on new line).

Chapter 3. Everything is about control

If else blocks

As in R

  • 0 represents logical false

  • Any value different from 0 represents logical true.

  • Don’t compare to 0, false, or true as redundant

Exs 1: Add in for loop

/* This may look like nonsense, but really is -*- mode: C -*- */
#include <stdlib.h>
#include <stdio.h>
  
/* The main thing that this program does. */
int main(void) {

  // Declarations
  double A[5] = {
    [0] = 9.0,
    [1] = 2.9,
    [4] = 3.E+25,
    [3] = .00007,
  };

  // Doing some work
  for (size_t i = 0; i < 5; ++i) {
    if (i) {
      printf("element %zu is %g, \tits square is %g\n",
             i,
             A[i],
             A[i]*A[i]);   
    }
  }
  
  return EXIT_SUCCESS;
}

Exs 1: Add in for loop - result

$ ./getting-started-if
element 1 is 2.9,       its square is 8.41
element 2 is 0,         its square is 0
element 3 is 7e-05,     its square is 4.9e-09
element 4 is 3e+25,     its square is 9e+50

for loops

We can have multiple loops iterating on i as long as scopes don’t overlap. [?]

[Exs 2] Try to imagine what happens when i has value 0 and is decremented by means of the operator --

  • Becomes null?

while and do

while works as in R

do is similar to while but the condition after the dependent block:

do { // Iterates
  x *= (2.0 - a*x); // Heron approximation
} while ( fabs (1.0 - a*x) >= eps ); // Iterates until close

=> if condition false, will iterate once before closing.

Using {} recommended. Note ; after while statement after do {}.

break and continue

break works as in R

continue is equivalent to next in R

for (;;) is equivalent to while(true) (former appears 23 times; latter just once in R sources)

heron.c

#include <stdlib.h>
#include <stdio.h>

/* lower and upper iteration limits centered around 1.0 */
static double const eps1m01 = 1.0 - 0x1P-01;
static double const eps1p01 = 1.0 + 0x1P-01;
static double const eps1m24 = 1.0 - 0x1P-24;
static double const eps1p24 = 1.0 + 0x1P-24;

int main(int argc, char* argv[argc+1]) {
  for (int i = 1; i < argc; ++i) {        // process args
    double const a = strtod(argv[i], 0);  // arg -> double
    double x = 1.0;
    for (;;) {                    // by powers of 2
      double prod = a*x;
      if (prod < eps1m01) {
        x *= 2.0;
      } else if   (eps1p01 < prod) {
        x *= 0.5;
      } else {
        break;
      }
    }
    for (;;) {                    // Heron approximation
      double prod = a*x;
      if ((prod < eps1m24) || (eps1p24 < prod)) {
        x *= (2.0 - prod);
      } else {
        break;
      }
    }
    printf("heron: a=%.5e,\tx=%.5e,\ta*x=%.12f\n",
           a, x, a*x);
  }
  return EXIT_SUCCESS;
}

Exs 4: Analyze heron.c using printf

Add condition print statement to loops, e.g.

if (i == 1) printf("i = %d, loop 1, x: %f\n", i, x);
$./heron 0.07 5 6E+23
i = 1, loop 1, x: 2.000000
i = 1, loop 1, x: 4.000000
i = 1, loop 1, x: 8.000000
i = 1, loop 1, x: 16.000000
i = 1, loop 2, x: 14.080000
i = 1, loop 2, x: 14.282752
i = 1, loop 2, x: 14.285714
heron: a=7.00000e-02,   x=1.42857e+01,  a*x=0.999999957002

Exs 5: Describe the use of the parameters argc and argv

int main(int argc, char* argv[argc+1]) {
  printf("argc: %d\n", argc);
  printf("argv[0]: %s\n", argv[0]);
...  
$./heron 0.07 5 6E+23
argc: 4
argv[0]: ./heron
argv[1]: 0.07
argv[2]: 5
argv[3]: 6E+23
argv[4]: (null)

main receives an array of pointers to char: the program name, argc-1 program arguments, and one null pointer that terminates the array.

Exs 6: Print and change eps1m01

Add condition print statement to loops, e.g.

  printf("eps1m01: %f\n", eps1m01);
$./heron 0.07 5 6E+23
eps1m01: 0.750000
i = 1, loop 1, x: 2.000000
i = 1, loop 1, x: 4.000000
i = 1, loop 1, x: 8.000000
i = 1, loop 1, x: 16.000000
i = 1, loop 2, x: 14.080000
i = 1, loop 2, x: 14.282752
i = 1, loop 2, x: 14.285714
heron: a=7.00000e-02,   x=1.42857e+01,  a*x=0.999999957002

Originally eps1m01: 0.5.

switch

In R, switch only returns output for one matching case.

In C, switch jumps to matching case and continues to end or break.

Exs 7: Leaving out break statement

With help from https://overiq.com/c-programming-101/the-switch-statement-in-c/ to get character input from scanf

#include <stdlib.h>
#include <stdio.h>

int main(void){
  char arg;
  
  printf("Enter arg: ");
  scanf(" %c", &arg);
  
  switch (arg) {
  case 'm': puts (" this  is a  magpie ");
    break ;
  case 'r': puts (" this  is a  raven ");
    break ;
  case 'j': puts (" this  is a  jay ");
   // break ;
  case 'c': puts (" this  is a  chough ");
    break ;
  default : puts (" this  is an  unknown   corvid ");
  }
  
  return EXIT_SUCCESS;
}

Exs 7: output

$ ./switch
Enter arg: r
 this  is a  raven 
$ ./switch
Enter arg: j
 this  is a  jay 
 this  is a  chough