Wednesday, June 24, 2015

C Programming: Pitfalls while word counting

In the earlier post "Count number of characters, words and lines in input", we saw how to count the number of characters, words and lines. Today, we will look at the pitfalls of word counting.

K&R has a separate exercise for this. Exercise 1-11 asks: "How would you test the word count program? What kinds of input are most likely to uncover bugs if there are any?"

When we wrote this program earlier, we did not handle many cases. There are caveats that need to be identified and addressed.

First, let's list the checks we usually need while word counting.

1. Check very short words.
2. Check very long words.
3. Check words that are separated by a newline. For example "kernel trap", where "kernel" is at the end of one line and "trap" follows on the next line.
4. Consider words like "isn't" and "tour's" as single words.
5. Check the overall file size: with a 32-bit int counter, the character count is only reliable for inputs smaller than 2GB.
6. Check mistyped words like "kernel  -  trap", which contain spaces in the middle, or words that use a - instead of a space, e.g. kernel-trap.
7. Check non-ASCII characters.
8. Check different encodings.

Please share your thoughts if I have missed any checks.

Monday, June 1, 2015

C Program : Count number of characters, words and lines in input

Earlier, we saw "C : Program to display tabs , backspaces visible in an unambiguous way". Today we will write a small C program which counts the number of characters, the number of words and the number of lines.

The program looks easy when we are asked to count lines and characters, but how about words?

If we read stdin character by character, we can count characters easily. Also, using '\n', we can tell that a line has ended. The only problem is counting words.

So, we will use a mechanism to detect when a new word starts. Let's introduce two states, IN and OUT: IN means we are currently inside a word, and OUT means we are outside one.

 #include <stdio.h>  
   
 #define IN 1  
 #define OUT 0  
   
 int main()  
 {  
   int c, nl, nw, nc, state;  
   
   state = OUT;  
   nl = nw = nc = 0;  
   
   while((c = getchar()) != EOF)  
   {  
     /* Increment number of characters */  
     ++nc;  
   
     /* Increment number of lines if end of line is encountered */  
     if( c == '\n' )  
     {  
       ++nl;  
     }  
     /* Space, newline or tab: we are outside a word */  
     if( c == ' ' || c == '\n' || c == '\t' )  
     {  
       state = OUT;   /* Just finished processing a word, if any */
     } /* First character of a new word: increment word count */  
     else if ( state == OUT )  
     {  
       state = IN;  
       ++nw;  
     }  
   }  
   printf(" Number of Characters = %d \n",nc);  
   printf(" Number of lines = %d \n",nl);  
   printf(" Number of Words = %d \n",nw);  
   return 0;  
 }  
   

Output of this program

 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
 This program counts  
 number of characters  
 number of words  
 number of lines  
  Number of Characters = 73   
  Number of lines = 4   
  Number of Words = 12   

In this program

  • We read input character by character.
  • First we increment the character count (nc).
  • We then check whether it is a newline. If so, we increment the line count (nl).
  • To count words (runs of non-blank characters), we track whether we are inside or outside a word.
  • A newline, tab or space ends the current word, so we set the state to OUT. The first other character seen while the state is OUT starts a new word, so we set the state to IN and increment the word count (nw).
  • Finally, when you press Ctrl+D (in Linux), the program displays the number of characters, number of lines and number of words.
Hope this helped :)


Monday, May 25, 2015

C : Program to display tabs , backspaces visible in an unambiguous way

In my earlier post, we saw "C : Program to replace multiples spaces with a single space". Today let's modify the same code to make tabs visible; backspaces can be handled the same way.

Below is the program which does this job.
 #include <stdio.h>  
 int main()  
 {  
   int c;   /* int rather than char, so that EOF can be detected */  
   printf("Enter a line to display tabs \n");  
   while((c=getchar()) != EOF && c != '\n')  
   {  
     if(c == '\t')  
     {  
         /* Print the two characters \ and t in place of the tab */  
         putchar('\\');  
         putchar('t');  
     }  
     else  
     {  
         putchar(c);  
     }  
   }  
   printf("\n");  
   return 0;  
 }  
In above program,
  • We read input character by character until the end of the line.
  • Check if it's a tab (\t). If so, we output a backslash ('\\') followed by 't', which displays \t.
  • If it's not a tab, we simply output the character.
Output of this program,
 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
 Enter a line to display tabs   
 This  program     displays     the tabs     entered  
 This \tprogram\tdisplays\tthe tabs\tentered  

Thursday, May 21, 2015

C : Program to replace multiples spaces with a single space

Earlier we saw a C program to count blanks and tabs. Using the same code, let's write another program which takes a line of words/characters, scans it for runs of multiple spaces and replaces each run with a single space.

For example, take the line below:
This     line has      multiple        spaces    between     words.

Ideally, while typing a sentence we leave a single space between words. This sentence has multiple spaces, which our program below removes and replaces with a single space:

This line has multiple spaces between words.
  #include <stdio.h>   
  int main()   
  {   
   int c;   
   printf("Enter a line to replace multiple spaces with single space\n");   
   /* Read character by character and parse till end of the line or end of input */  
   while((c=getchar()) != EOF && c != '\n')   
   {   
    if(c == ' ')   
    {   
     /* Notice semi-colon at end of below line. It reads a character and if it's a space  
      it discards it and reads again till the character is not a space */  
     while( (c=getchar()) == ' ');   
      /* Now just output single space */  
      putchar(' ');   
      /* Stop if the run of spaces was ended by the end of the line or input */  
      if(c == EOF || c == '\n')   
       break;   
    }   
    putchar(c);   
   }   
   printf("\n");   
   return 0;   
  }   
From the above code,
  • Using a while loop, we read character by character with getchar().
  • Each character is first checked for the end of the line.
  • If it is not the end of the line ('\n'), we check if it's a space.
  • If it's a space, we discard the spaces that follow by reading them in an inner while loop. Note the semicolon at the end of that loop.
  • We then output a single space, followed by the non-space character that ended the run.
  • Finally we print a newline and return.
Output of this program,
 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
 Enter a line to replace multiple spaces with single space  
 This program  replaces       multiple spaces    with   single  space  
 This program replaces multiple spaces with single space  


Monday, May 18, 2015

C : How to print line numbers and function names while debugging C programs

While debugging a C program, we often add printf calls to print some values or just a debug message. It is an additional advantage if you can print the line number and function name when the program is too big to debug. If you are dealing with multiple files, it's a good idea to print the file name too, isn't it?

In C, there are a few predefined macros which allow you to print line numbers, function names and file names.

__LINE__           : Expands to the current line number
__FUNCTION__: Expands to the current function name
__FILE__            : Expands to the current file name

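__LINE__ and __FILE__ are predefined macros and part of the C and C++ standards. During preprocessing, they are replaced respectively by an integer constant holding the current line number and by a string literal holding the current file name. __FUNCTION__ is not part of the standard; it is an extension supported by compilers such as GCC.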
 #include <stdio.h>  
   
 int main()  
 {  
   printf("\n This program prints line number %d\n",__LINE__);  
   printf("\n This program prints function name %s() \n",__FUNCTION__);  
   printf("\n This program prints file name %s\n",__FILE__);  
   return 0;  
 }  

Output of this program:
 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
   
  This program prints line number 5  
   
  This program prints function name main()   
   
  This program prints file name print_line_function_file_name.c  
   

Other predefined identifiers:

__func__ : function name (this is part of C99; older C++ compilers may not support it)
__DATE__ : a string of form "Mmm dd yyyy"
__TIME__ : a string of form "hh:mm:ss"

C : Program to count spaces and tabs

In the earlier post "Character counting in C", we wrote a program to count the number of characters. Let's extend this further and count blanks and tabs.
 #include <stdio.h>   
    
 int main()   
 {   
   long char_count;   /* number of characters read so far */   
   int input_char;    /* int rather than char, so that EOF can be detected */   
   int tabs_count=0,spaces_count=0;   
    
   printf("\n Input a line to count spaces and tabs \n");   
    
   /* Reading a character */   
   input_char = getchar();   
    
   /* In a for loop read characters till end of line or end of input is encountered */   
   for(char_count = 0; input_char != EOF && input_char != '\n'; ++char_count)   
   {   
     /* Check if input character is a space */     
     if(input_char == ' ')   
     {   
       /* if space, increment space variable count */          
       ++spaces_count;     
     }      
     /* else if input character is a tab */   
     else if(input_char == '\t')   
     {   
       /* if tab increment tab count */             
       ++tabs_count;      
     }   
     /* Read next character to check if it's a space or tab */   
     input_char = getchar();       
   }   
    
   /* Finally print number of spaces and tabs read */   
   printf("\n Your input contains %d spaces, %d tabs \n",spaces_count,tabs_count);   
    
   return 0;    
 }  
   
In the above program,
  • We read a character.
  • In a for loop that runs until the end of the line is encountered, we check whether each character entered is a space (' ') or a tab ('\t').
  • We increment the spaces_count and tabs_count variables accordingly.
  • Finally, when a newline is encountered, we print the space count and tab count.
Output of this program :
 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
   
  Input a line to count spaces and tabs   
 will be giving a tab now     just one tab behind  
   
  Your input contains 9 spaces, 1 tabs  

Linux : How to recursively touch files and folders in a directory

Today we will learn a small tip: touching multiple files and folders inside a folder from the Linux terminal.

I assume you know about the "touch" command in Linux; if not, please read its man page first.

Use the command below to touch files/folders recursively.
 find . -exec touch {} \;  

Execute this command in the directory in which you would like to perform this operation.

This command,

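  • find . walks the current directory recursively and finds each file and folder
  • -exec is used to invoke "touch" as a subprocess for each of them
  • The curly braces {} are replaced by the path of each file found, and the backslash-escaped semicolon \; marks the end of the -exec command, as per the syntax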
Hope you find this tip useful.