Showing posts with label C program. Show all posts

Wednesday, June 24, 2015

C Programming: Pitfalls while word counting

In the earlier post "Count number of characters, words and lines in input", we saw how to count the number of characters, words and lines. Today, we will look at the pitfalls of word counting.

K&R has a separate exercise for this. Exercise 1-11 asks: "How would you test the word count program? What kinds of input are most likely to uncover bugs if there are any?"

When we wrote this program earlier, we did not handle many cases. There are caveats that need to be identified and addressed.

First, let's list the checks we usually need while word counting.

1. Check very short words.
2. Check very long words.
3. Check words that are separated by a new line. For example "kernel trap", where "kernel" is at the end of one line and "trap" follows on the next line.
4. Treat words like "isn't" and "tour's" as single words.
5. Check that the overall file size is less than 2GB, since an int character counter may overflow beyond that.
6. Check mistyped words like "kernel  -  trap", which contain spaces in the middle, or which use a - instead of a space, e.g. kernel-trap.
7. Check for non-ASCII characters.
8. Check for different encodings.

Please share your thoughts if I have missed any checks.

Monday, June 1, 2015

C Program : Count number of characters, words and lines in input

Earlier, we saw "C : Program to display tabs , backspaces visible in an unambiguous way". Today we will write a small C program that counts the number of characters, words and lines in its input.

The program looks easy when we only have to count lines and characters, but what about words?

If we read stdin character by character, we can count characters easily. Also, using '\n', we can tell that a line has ended. The only problem is counting words.

So we will use a mechanism to detect when a word has started. Let's introduce two states, IN and OUT: IN means we are currently inside a word, and OUT means we are outside one.

 #include <stdio.h>  
   
 #define IN 1  
 #define OUT 0  
   
 int main()  
 {  
   int c, nl, nw, nc, state;  
   
   state = OUT;  
   nl = nw = nc = 0;  
   
   while((c = getchar()) != EOF)  
   {  
     /* Increment number of characters */  
     ++nc;  
   
     /* Increment number of lines if end of line is encountered */  
     if( c == '\n' )  
     {  
       ++nl;  
     }  
     /* Space, newline or tab: we are outside a word */  
     if( c == ' ' || c == '\n' || c == '\t' )  
     {  
       state = OUT;   /* Any word in progress has just ended */
     } /* First character of a new word: increment word count */  
     else if ( state == OUT )  
     {  
       state = IN;  
       ++nw;  
     }  
   }  
   printf(" Number of Characters = %d \n",nc);  
   printf(" Number of lines = %d \n",nl);  
   printf(" Number of Words = %d \n",nw);  
   return 0;  
 }  
   

Output of this program

 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
 This program counts  
 number of characters  
 number of words  
 number of lines  
  Number of Characters = 73   
  Number of lines = 4   
  Number of Words = 12   

In this program

  • We read input character by character.
  • First we increment the character count (nc).
  • We then check whether the character is a new line. If it is, we increment the line count (nl).
  • To count words (runs of non-whitespace characters), we need to track where one word ends and the next begins.
  • If the character is a new line, tab or space, we are outside a word (state = OUT). The first other character seen after that starts a new word, so we set state = IN and increment the word count (nw).
  • Finally, when you press Ctrl+D (in Linux), the program displays the number of characters, lines and words.
Hope this helped :)


Monday, May 25, 2015

C : Program to display tabs , backspaces visible in an unambiguous way

In my earlier post, we saw "C : Program to replace multiples spaces with a single space". Today, let's modify the same code to display tabs visibly.

Below is the program which does this job.
 #include <stdio.h>  
   
 int main()  
 {  
   int c;  /* int, not char, so that EOF can be detected */  
   printf("Enter a line to display tabs \n");  
   while((c = getchar()) != EOF && c != '\n')  
   {  
     if(c == '\t')  
     {  
         putchar('\\');  
         putchar('t');  
     }  
     else  
     {  
         putchar(c);  
     }  
   }  
   printf("\n");  
   return 0;  
 }  
In above program,
  • We read input character by character, until the end of the line or the end of input.
  • We check whether the character is a tab (\t). If it is, we output a backslash (\\) followed by the (t) character, which displays \t.
  • If it is not a tab, we simply output the character.
Output of this program,
 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
 Enter a line to display tabs   
 This  program     displays     the tabs     entered  
 This \tprogram\tdisplays\tthe tabs\tentered  

Monday, May 18, 2015

C : How to print line numbers and function names while debugging C programs

While debugging a C program, we often add printf calls to print some values or just a debug message. When the program is too big to debug easily, it is an additional advantage to print the line number and function name as well. If you are dealing with multiple files, it is also a good idea to print the file name.

In C, there are a few predefined macros which allow you to print line numbers, function names and file names.

__LINE__     : expands to the current line number
__FUNCTION__ : expands to the current function name
__FILE__     : expands to the current file name

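__LINE__ and __FILE__ are predefined macros and part of the C and C++ standards. During preprocessing, they are replaced respectively by an integer constant holding the current line number and by a string literal holding the current file name. __FUNCTION__ is a compiler extension (supported by GCC, among others) rather than a standard macro.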
 #include <stdio.h>  
   
 int main()  
 {  
   printf("\n This program prints line number %d\n",__LINE__);  
   printf("\n This program prints function name %s() \n",__FUNCTION__);  
   printf("\n This program prints file name %s\n",__FILE__);  
   return 0;  
 }  

Output of this program:
 mrtechpathi@mrtechpathi:~/Study/C/K_and_R$ ./a.out   
   
  This program prints line number 5  
   
  This program prints function name main()   
   
  This program prints file name print_line_function_file_name.c  
   

Other predefined names:

__func__ : function name (part of C99 and C++11; older C++ compilers may not support it)
__DATE__ : a string of form "Mmm dd yyyy"
__TIME__ : a string of form "hh:mm:ss"