Lucid Dreaming - Dream Views

    1. #1 Ynot

      Bash Script Fu....

      I know this maybe isn't the best forum to ask,
      but I know most people here, so I'd rather ask than post somewhere else and get no answers

      anyway, I have this:
      Code:
      #!/bin/sh
      
      # Build the weekly report in a temp file, then mail it at the end.
      # Note the blank line after the Subject: header -- sendmail needs it
      # to separate the mail headers from the body.
      echo "To: [email protected]
      Subject: JG Server - Weekly Checkup
      
      ***** JG Server - Weekly Checkup *****
      " > /tmp/jg_weekly.txt
      
      uptime >> /tmp/jg_weekly.txt
      
      echo -n "
      Hostname:        " >> /tmp/jg_weekly.txt
      hostname >> /tmp/jg_weekly.txt
      
      echo -n "Hostname (FQDN):    " >> /tmp/jg_weekly.txt
      hostname -f >> /tmp/jg_weekly.txt
      
      echo "" >> /tmp/jg_weekly.txt
      
      # Filesystem-level usage for the ext3 partitions
      df -h --type=ext3 >> /tmp/jg_weekly.txt
      
      echo "
      Location    Size    Mounted on" >> /tmp/jg_weekly.txt
      
      echo -n "E: Drive:    " >> /tmp/jg_weekly.txt
      du -hs /media/data/data >> /tmp/jg_weekly.txt
      
      # Per-extension totals: each line below walks the whole tree with find
      # and forks du once per file, then awk sums the KB figures and scales
      # them to K/M/G/T -- this is the slow part
      echo -n "    *.doc =    " >> /tmp/jg_weekly.txt
      find /media/data/data -type f -iname '*.doc' -exec du -k {} \; | awk '{sum+=$1} END {split("K,M,G,T", Units, ",");u = 1;while (sum >= 1024){sum = sum / 1024;u += 1}sum = sprintf("%.1f%s", sum, Units[u]);print sum;}' >> /tmp/jg_weekly.txt
      echo -n "    *.xls =    " >> /tmp/jg_weekly.txt
      find /media/data/data -type f -iname '*.xls' -exec du -k {} \; | awk '{sum+=$1} END {split("K,M,G,T", Units, ",");u = 1;while (sum >= 1024){sum = sum / 1024;u += 1}sum = sprintf("%.1f%s", sum, Units[u]);print sum;}' >> /tmp/jg_weekly.txt
      echo -n "    *.jpg =    " >> /tmp/jg_weekly.txt
      find /media/data/data -type f -iname '*.jpg' -exec du -k {} \; | awk '{sum+=$1} END {split("K,M,G,T", Units, ",");u = 1;while (sum >= 1024){sum = sum / 1024;u += 1}sum = sprintf("%.1f%s", sum, Units[u]);print sum;}' >> /tmp/jg_weekly.txt
      echo -n "    *.dwg =    " >> /tmp/jg_weekly.txt
      find /media/data/data -type f -iname '*.dwg' -exec du -k {} \; | awk '{sum+=$1} END {split("K,M,G,T", Units, ",");u = 1;while (sum >= 1024){sum = sum / 1024;u += 1}sum = sprintf("%.1f%s", sum, Units[u]);print sum;}' >> /tmp/jg_weekly.txt
      echo -n "    *.pdf =    " >> /tmp/jg_weekly.txt
      find /media/data/data -type f -iname '*.pdf' -exec du -k {} \; | awk '{sum+=$1} END {split("K,M,G,T", Units, ",");u = 1;while (sum >= 1024){sum = sum / 1024;u += 1}sum = sprintf("%.1f%s", sum, Units[u]);print sum;}' >> /tmp/jg_weekly.txt
      
      echo -n "Mailboxes:    " >> /tmp/jg_weekly.txt
      du -hs /var/mail/virtual >> /tmp/jg_weekly.txt
      
      echo -n "Web Pages:    " >> /tmp/jg_weekly.txt
      du -hs /var/www >> /tmp/jg_weekly.txt
      
      # Mail the report and clean up
      /usr/sbin/sendmail [email protected] < /tmp/jg_weekly.txt
      rm /tmp/jg_weekly.txt
      which outputs:
      Code:
      ***** JG Server - Weekly Checkup *****
      
       16:01:59 up 7 days, 22:13,  1 user,  load average: 0.93, 1.02, 0.81
      
      Hostname:        server1
      Hostname (FQDN):    server1.blah.co.uk
      
      Filesystem            Size  Used Avail Use% Mounted on
      /dev/sda2              18G  3.1G   14G  19% /
      /dev/sdb1              57G   29G   26G  53% /media/data
      
      Location    Size    Mounted on
      E: Drive:    25G    /media/data/data
          *.doc =    2.5G
          *.xls =    24.9M
          *.jpg =    16.2G
          *.dwg =    688.6M
          *.pdf =    1.1G
      Mailboxes:    3.1G    /var/mail/virtual
      Web Pages:    119M    /var/www
      and the thing takes 10+ minutes to run
      and while it's only going to run once a week (midnight on Sundays), I'd like to speed it up if I can
      I know I'm being horribly inefficient calculating the different file-type totals
      does anyone have any advice for speeding it up?

      Thanks

    2. #2 ninja9578
      You're summing everything up, so why aren't you using the -s parameter instead of -k?

      Also, the -h parameter will put it into MB and GB for you (won't it?)


      Wait for confirmation from somebody else before quoting me on that. I'm not much of a UNIX or script guy. I may have misunderstood my UNIX book. Want gfx help instead?
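
      For reference, a quick illustration of what those two flags do (assuming GNU du; the outputs in the comments are only examples, chosen to match the report above):
      Code:
      # -s prints one summary line per argument instead of one per subdirectory;
      # -k forces units of 1K blocks; -h scales the figure to K/M/G for display
      du -sk /media/data/data    # e.g. "26214400   /media/data/data"
      du -sh /media/data/data    # e.g. "25G        /media/data/data"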

    3. #3 Ynot
      Quote Originally Posted by ninja9578 View Post
      You're summing everything up, why aren't you using the -s parameter instead of -k?
      I'm summing the main file types across our network share

      du -s will give me an overall total of disk-space usage, but for all files and directories (not broken down by extension)

      Quote Originally Posted by ninja9578 View Post
      Also, the -h parameter will put it into MB and GB for you (won't it?)
      Yes, -h will, but I need everything in a standard unit (KB) so I can sum the values properly

      (indeed, I'm using -s & -h to do the "E: Drive" summary before the file-type breakdowns)

      I was just wondering if there was a better way to do it?
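
      Structurally, at least, the repeated pipeline can be factored into a shell function. This is no faster, just tidier; sum_ext is a hypothetical name, and the path and awk body match the script above:
      Code:
      # Total disk usage (KB) of files matching one pattern, printed
      # human-readably -- same find/du/awk pipeline as the original script
      sum_ext() {
          find /media/data/data -type f -iname "$1" -exec du -k {} \; |
          awk '{ sum += $1 }
               END {
                   split("K,M,G,T", u, ",")
                   i = 1
                   while (sum >= 1024) { sum /= 1024; i++ }
                   printf "%.1f%s\n", sum, u[i]
               }'
      }
      
      echo -n "    *.doc =    " >> /tmp/jg_weekly.txt
      sum_ext '*.doc' >> /tmp/jg_weekly.txt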

    4. #4 Replicon
      What's taking longer, the du or the finds? One obvious optimization is to run find only once, though I would think that after the first run it would go faster due to caching. That's something to check into. You should also echo something to stdout between operations, just to get a feel for what's taking so long. If it's the repeated finds, then you will definitely benefit from collapsing them into one.
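
      A minimal sketch of that kind of check, using time on each step rather than echoed markers (run interactively rather than from cron; the paths are the ones from the original script):
      Code:
      # time each expensive step separately to see which one dominates
      time du -hs /media/data/data > /dev/null
      time find /media/data/data -type f -iname '*.doc' -exec du -k {} \; > /dev/null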

      If it's the 'du' that's causing problems (and I have no idea whether you have a large number of little files or a small number of large files), then two possible solutions to consider are:

      1) mount /media/data/data separately, and then you can just run df, which is quick (this requires a separate partition though...)
      2) Depending on how your stuff is laid out, you can run 'du' on everything in /media/data EXCEPT /media/data/data and subtract that from your df results above. Of course, you'll have to rerun df to get block-level sizes rather than human-readable ones, and then muck with the results to format them the right way, but that's no longer a performance limitation (see the sketch below).
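
      A sketch of option 2, assuming GNU find/du and that df's "Used" column sits in field 3 of its second output line (1K blocks throughout):
      Code:
      # used blocks on the whole /media/data filesystem, from df (fast)
      used=$(df -k /media/data | awk 'NR==2 { print $3 }')
      
      # du everything directly under /media/data except the data tree
      rest=$(find /media/data -mindepth 1 -maxdepth 1 ! -name data \
              -exec du -sk {} + | awk '{ sum += $1 } END { print sum + 0 }')
      
      echo "data tree: $(( used - rest )) KB"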

      More sophisticated options might involve keeping track of the size dynamically, but it may not be worth it. Are you expecting this thing to scale to a point where it takes hours to run? A few minutes may not be the end of the world. Maybe if you install disk quotas and enforce a quota on the directory (with no maximum), the size will be tracked for you and you can just run the disk-quota utilities to see how much space is being used, but I'm not certain whether quotas work on a directory basis or a user basis.

    5. #5 Ynot
      Quote Originally Posted by Replicon View Post
      What's taking longer, the du or the finds? [...]
      thanks,
      lots (and lots) of small files - almost everything is under 10 MB in size
      it's the finds that take the time
      but again, using df wouldn't give me the breakdown by extension
      
      I think multiple finds are the only way to do this by extension
      (short of a daemon monitoring writes & updating totals throughout the week - which I really don't want to do)
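
      For what it's worth, a single pass can cover all five extensions at once. A sketch assuming GNU find's -printf (%k prints a file's disk usage in 1K blocks); this also avoids forking du once per file, which, with lots of small files, is likely where most of the ten minutes goes (totals printed in MB for simplicity):
      Code:
      # one walk of the tree; awk buckets the KB totals by extension
      find /media/data/data -type f \( -iname '*.doc' -o -iname '*.xls' \
          -o -iname '*.jpg' -o -iname '*.dwg' -o -iname '*.pdf' \) \
          -printf '%k %p\n' | awk '
          {
              kb = $1
              n = split($0, p, ".")        # extension = text after the last dot
              sum[tolower(p[n])] += kb     # safe even if the path has spaces
          }
          END {
              for (e in sum) printf "    *.%s =    %.1fM\n", e, sum[e] / 1024
          }'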

      Age-old curse:
      apps programmers make poor sysadmins
      and sysadmins make poor application programmers

      thanks both for having a look

    6. #6 Replicon
      Oh, whoops, I completely missed the du -k and awk-fu at the end of each find.

      Have you considered using "ls -lR" instead of "find + du"?
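
      A sketch of that approach. Two caveats: ls reports apparent byte sizes rather than disk blocks like du, and parsing ls output breaks on filenames containing spaces (totals printed in MB):
      Code:
      # recursive listing; regular-file lines start with "-";
      # column 5 is the size in bytes, the last field is the name
      ls -lR /media/data/data | awk '
          /^-/ {
              n = split($NF, p, ".")
              if (n > 1) sum[tolower(p[n])] += $5
          }
          END {
              for (e in sum) printf "*.%s = %.1fM\n", e, sum[e] / 1048576
          }'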

    7. #7 dsr
      It really seems like there should be a way to eliminate all the awk stuff, but I can't help with the actual scripting. However, you might want to run the script in verbose mode (sh -v [filename]) so you can spot the bottleneck. I think we already know where it is, but you never know for sure.
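
      Usage would look something like this (the script path is hypothetical); -x, which traces commands as they execute, is often more telling than -v, and time gives a baseline to compare against after any change:
      Code:
      sh -v /usr/local/bin/jg_weekly.sh     # print each line as it is read
      sh -x /usr/local/bin/jg_weekly.sh     # trace each command as it runs
      time sh /usr/local/bin/jg_weekly.sh   # overall wall-clock baseline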
