her.esy.fun/src/Scratch/en/blog/2010-02-23-When-regexp-is-n.../index.html

143 lines
8.1 KiB
HTML
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>YBlog - When regexp is not the best solution</title>
<meta name="keywords" content="programming, regexp, regular expression, extension, file" />
<link rel="shortcut icon" type="image/x-icon" href="../../../../Scratch/img/favicon.ico" />
<link rel="stylesheet" type="text/css" href="/css/y.css" />
<link rel="stylesheet" type="text/css" href="/css/legacy.css" />
<link rel="alternate" type="application/rss+xml" title="RSS" href="/rss.xml" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="apple-touch-icon" href="../../../../Scratch/img/about/FlatAvatar@2x.png" />
<!--[if lt IE 9]>
<script src="http://ie7-js.googlecode.com/svn/version/2.1(beta4)/IE9.js"></script>
<![endif]-->
<!-- IndieAuth -->
<link href="https://twitter.com/yogsototh" rel="me">
<link href="https://github.com/yogsototh" rel="me">
<link href="mailto:yann.esposito@gmail.com" rel="me">
<link rel="pgpkey" href="../../../../pubkey.txt">
</head>
<body lang="en" class="article">
<div id="content">
<div id="header">
<div id="choix">
<span id="choixlang">
<a href="../../../../Scratch/fr/blog/2010-02-23-When-regexp-is-not-the-best-solution/">French</a>
</span>
<span class="tomenu"><a href="#navigation">↓ Menu ↓</a></span>
<span class="flush"></span>
</div>
</div>
<div id="titre">
<h1>When regexp is not the best solution</h1>
</div>
<div class="flush"></div>
<div id="afterheader" class="article">
<div class="corps">
<p>Regular expression are really useful. Unfortunately, they are not always the best way of doing things. Particularly when transformations you want to make are easy.</p>
<p>I wanted to know how to get file extension from filename the fastest way possible. There is 3 natural way of doing this:</p>
<div>
<p><code class="ruby"> # regexp str.match(/[^.]*<span class="math inline">/);<em>e</em><em>x</em><em>t</em>=</span>&amp;</p>
<h1 id="split">split</h1>
<p>ext=str.split(.)[-1]</p>
<h1 id="file-module">File module</h1>
ext=File.extname(str) </code>
</div>
<p>At first sight I believed that the regexp should be faster than the split because it could be many <code>.</code> in a filename. But in reality, most of time there is only one dot and I realized the split will be faster. But not the fastest way. There is a function dedicated to this work in the <code>File</code> module.</p>
<p>Here is the Benchmark ruby code:</p>
<div>
<p><code class="ruby" file="regex_benchmark_ext.rb"> #!/usr/bin/env ruby require benchmark n=80000 tab=[ /accounts/user.json, /accounts/user.xml, /user/titi/blog/toto.json, /user/titi/blog/toto.xml ]</p>
puts “Get extname” Benchmark.bm do |x| x.report(“regexp:”) { n.times do str=tab[rand(4)]; str.match(/[^.]*<span class="math inline">/);<em>e</em><em>x</em><em>t</em>=</span>&amp;; end } x.report(" split:“) { n.times do str=tab[rand(4)]; ext=str.split(.)[-1] ; end } x.report(” File:") { n.times do str=tab[rand(4)]; ext=File.extname(str); end } end </code>
</div>
<p>And here is the result</p>
<pre class="twilight">
Get extname
user system total real
regexp: 2.550000 0.020000 2.570000 ( 2.693407)
split: 1.080000 0.050000 1.130000 ( 1.190408)
File: 0.640000 0.030000 0.670000 ( 0.717748)
</pre>
<p>Conclusion of this benchmark, dedicated function are better than your way of doing stuff (most of time).</p>
<h2 id="file-path-without-the-extension.">file path without the extension.</h2>
<div>
<p><code class="ruby" file="regex_benchmark_strip.rb"> #!/usr/bin/env ruby require benchmark n=80000 tab=[ /accounts/user.json, /accounts/user.xml, /user/titi/blog/toto.json, /user/titi/blog/toto.xml ]</p>
puts “remove extension” Benchmark.bm do |x| x.report(" File:“) { n.times do str=tab[rand(4)]; path=File.expand_path(str,File.basename(str,File.extname(str))); end } x.report(”chomp:") { n.times do str=tab[rand(4)]; ext=File.extname(str); path=str.chomp(ext); end } end </code>
</div>
<p>and here is the result:</p>
<pre class="twilight">
remove extension
user system total real
File: 0.970000 0.060000 1.030000 ( 1.081398)
chomp: 0.820000 0.040000 0.860000 ( 0.947432)
</pre>
<p>Conclusion of the second benchmark. One simple function is better than three dedicated functions. No surprise, but it is good to know.</p>
</div>
<div id="afterarticle">
<div id="social">
<a href="/rss.xml" target="_blank" rel="noopener noreferrer nofollow" class="social">RSS</a>
·
<a href="https://twitter.com/home?status=http%3A%2F%2Fyannesposito.com/Scratch/en/blog/2010-02-23-When-regexp-is-not-the-best-solution/%20via%20@yogsototh" target="_blank" rel="noopener noreferrer nofollow" class="social">Tweet</a>
·
<a href="http://www.facebook.com/sharer/sharer.php?u=http%3A%2F%2Fyannesposito.com/Scratch/en/blog/2010-02-23-When-regexp-is-not-the-best-solution/" target="_blank" rel="noopener noreferrer nofollow" class="social">FB</a>
<br />
<a class="message" href="../../../../Scratch/en/blog/Social-link-the-right-way/">These social sharing links preserve your privacy</a>
</div>
<div id="navigation">
<a href="../../../../">Home</a>
<span class="sep">¦</span>
<a href="../../../../Scratch/en/blog">Blog</a>
<span class="sep">¦</span>
<a href="../../../../Scratch/en/softwares">Softwares</a>
<span class="sep">¦</span>
<a href="../../../../Scratch/en/about">About</a>
</div>
<div id="totop"><a href="#header">↑ Top ↑</a></div>
<div id="bottom">
<div>
Published on 2010-02-23
</div>
<div>
<a href="https://twitter.com/yogsototh">Follow @yogsototh</a>
</div>
<div>
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/deed.en_US">Yann Esposito©</a>
</div>
<div>
Done with
<a href="http://www.vim.org" target="_blank" rel="noopener noreferrer nofollow"><strike>Vim</strike></a>
<a href="http://spacemacs.org" target="_blank" rel="noopener noreferrer nofollow">spacemacs</a>
<span class="pala">&amp;</span>
<a href="http://nanoc.ws" target="_blank" rel="noopener noreferrer nofollow"><strike>nanoc</strike></a>
<a href="http://jaspervdj.be/hakyll" target="_blank" rel="noopener noreferrer nofollow">Hakyll</a>
</div>
<hr />
<div style="max-width: 100%">
<a href="https://cardanohub.org">
<img src="../../../../Scratch/img/ada-logo.png" class="simple" style="height: 16px;
border-radius: 50%;
vertical-align:middle;
display:inline-block;" />
ADA:
</a>
<code style="display:inline-block;
word-wrap:break-word;
text-align: left;
vertical-align: top;
max-width: 85%;">
DdzFFzCqrhtAvdkmATx5Fm8NPJViDy85ZBw13p4XcNzVzvQg8e3vWLXq23JQWFxPEXK6Kvhaxxe7oJt4VMYHxpA2vtCFiP8fziohN6Yp
</code>
</div>
</div>
</div>
</div>
</div>
</body>
</html>