<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<title>YBlog - split a file by keyword</title>
<meta name="keywords" content="awk, shell, script" />
<link rel="shortcut icon" type="image/x-icon" href="../../../../Scratch/img/favicon.ico" />
<link rel="stylesheet" type="text/css" href="/css/y.css" />
<link rel="stylesheet" type="text/css" href="/css/legacy.css" />
<link rel="alternate" type="application/rss+xml" title="RSS" href="/rss.xml" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="apple-touch-icon" href="../../../../Scratch/img/about/FlatAvatar@2x.png" />
<!--[if lt IE 9]>
<script src=""></script>
<![endif]-->
<!-- IndieAuth -->
<link href="" rel="me">
<link href="" rel="me">
<link href="" rel="me">
<link rel="pgpkey" href="../../../../pubkey.txt">
<body lang="en" class="article">
<div id="content">
<div id="header">
<div id="choix">
<span id="choixlang">
<a href="../../../../Scratch/fr/blog/2010-02-18-split-a-file-by-keyword/">French</a>
<span class="tomenu"><a href="#navigation">↓ Menu ↓</a></span>
<span class="flush"></span>
<div id="titre">
<h1>split a file by keyword</h1>
<div class="flush"></div>
<div id="afterheader" class="article">
<div class="corps">
<p>Strangely enough, I didn’t find any built-in tool to split a file by keyword, so I wrote one in <code>awk</code>. I put it here mostly for myself, but it could also help someone else. The following code starts a new output file (<code>fic.001</code>, <code>fic.002</code>, and so on) at each line containing the word <code>UTC</code>.</p>
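<pre><code class="perl">#!/usr/bin/awk -f
BEGIN {
    i = 0
    FIC = "fic.000"    # lines before the first UTC line go here
}
/UTC/ {
    close(FIC)         # release the previous file before switching
    i += 1
    FIC = sprintf("fic.%03d", i)
}
{ print $0 &gt;&gt; FIC }
</code></pre>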
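<p>To check how the lines will be distributed before writing any file, a dry run can help. The following is a minimal sketch, not part of the original script: <code>preview_split</code> is a hypothetical shell function that prints each line prefixed with the name of the file it would go to. Lines seen before the first match are shown under <code>fic.000</code>.</p>
<pre><code># preview_split is a hypothetical helper, not part of the original script.
# It prints every input line prefixed with the file it would be written to.
preview_split() {
    awk '/UTC/ { i += 1 } { printf "fic.%03d: %s\n", i, $0 }' "$@"
}

# Sample input: every line containing UTC starts a new file.
printf '%s\n' \
    'Mon Dec 7 10:32:30 UTC 2009' \
    'first entry' \
    'Mon Dec 7 18:02:11 UTC 2009' \
    'second entry' |
preview_split
</code></pre>
<p>With this sample input, the preview prints:</p>
<pre class="twilight">
fic.001: Mon Dec 7 10:32:30 UTC 2009
fic.001: first entry
fic.002: Mon Dec 7 18:02:11 UTC 2009
fic.002: second entry
</pre>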
<p>In my real-world case, I wanted one file per day. Each line containing <code>UTC</code> has the following format:</p>
<pre class="twilight">
Mon Dec 7 10:32:30 UTC 2009
</pre>
<p>I ended up with the following code, which starts a new file only when the day changes:</p>
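<pre><code class="perl">#!/usr/bin/awk -f
BEGIN {
    i = 0
    FIC = "fic.000"    # lines before the first UTC line go here
}
/UTC/ {
    date = $1 $2 $3    # e.g. "MonDec7" for "Mon Dec 7 10:32:30 UTC 2009"
    if (date != olddate) {
        olddate = date
        close(FIC)     # release the previous file before switching
        i += 1
        FIC = sprintf("fic.%03d", i)
    }
}
{ print $0 &gt;&gt; FIC }
</code></pre>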
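<p>The per-day grouping can be checked the same way without creating any file. This is only a sketch under assumptions: <code>preview_split_by_day</code> is a hypothetical shell function, and the sample lines are made up to follow the timestamp format shown above. It prints each line prefixed with the file it would be written to.</p>
<pre><code># preview_split_by_day is a hypothetical helper, not part of the original script.
# A new file number is used only when the day of a UTC line differs from the previous one.
preview_split_by_day() {
    awk '
        /UTC/ {
            date = $1 $2 $3    # weekday, month, day of month
            if (date != olddate) { olddate = date; i += 1 }
        }
        { printf "fic.%03d: %s\n", i, $0 }
    ' "$@"
}

# Made-up sample: two timestamps on Dec 7, one on Dec 8.
printf '%s\n' \
    'Mon Dec 7 10:32:30 UTC 2009' \
    'first entry' \
    'Mon Dec 7 18:02:11 UTC 2009' \
    'second entry' \
    'Tue Dec 8 09:15:00 UTC 2009' \
    'third entry' |
preview_split_by_day
</code></pre>
<p>Here the first four lines are labelled <code>fic.001</code> and the last two <code>fic.002</code>. The date is only compared with the previous timestamp, so a log that comes back to an earlier day later on would start yet another file.</p>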
<div id="afterarticle">
<div id="social">
<a href="/rss.xml" target="_blank" rel="noopener noreferrer nofollow" class="social">RSS</a>
<a href="" target="_blank" rel="noopener noreferrer nofollow" class="social">Tweet</a>
<a href="" target="_blank" rel="noopener noreferrer nofollow" class="social">FB</a>
<br />
<a class="message" href="../../../../Scratch/en/blog/Social-link-the-right-way/">These social sharing links preserve your privacy</a>
<div id="navigation">
<a href="../../../../">Home</a>
<span class="sep">¦</span>
<a href="../../../../Scratch/en/blog">Blog</a>
<span class="sep">¦</span>
<a href="../../../../Scratch/en/softwares">Softwares</a>
<span class="sep">¦</span>
<a href="../../../../Scratch/en/about">About</a>
<div id="totop"><a href="#header">↑ Top ↑</a></div>
<div id="bottom">
Published on 2010-02-18
<a href="">Follow @yogsototh</a>
<a rel="license" href="">Yann Esposito©</a>
Done with
<a href="" target="_blank" rel="noopener noreferrer nofollow"><strike>Vim</strike></a>
<a href="" target="_blank" rel="noopener noreferrer nofollow">spacemacs</a>
<span class="pala">&amp;</span>
<a href="" target="_blank" rel="noopener noreferrer nofollow"><strike>nanoc</strike></a>
<a href="" target="_blank" rel="noopener noreferrer nofollow">Hakyll</a>
<hr />
<div style="max-width: 100%">
<a href="">
<img src="../../../../Scratch/img/ada-logo.png" class="simple" style="height: 16px;
border-radius: 50%;
display:inline-block;" />
<code style="display:inline-block;
text-align: left;
vertical-align: top;
max-width: 85%;">