Rough wind, that moanest loud
Grief too sad for song;
Wild wind, when sullen cloud
Knells all the night long;
Sad storm, whose tears are vain,
Bare woods, whose branches strain,
lang = "en" content = "The Green Table and the Red Chair">
content = "1935">
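content = "1939">

Note that the qualifier syntax and label suffixes (which follow an element name and a period) used in the examples in this document merely reflect current trends in the HTML encoding of qualifiers. Use of this syntax and these suffixes is neither a standard nor a recommendation.

7. Encoding Dublin Core Elements

This section gives example encodings for the individual DC elements:

Title (name given to the resource)
-----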
lang = "en" content = "The Author gives some Account of Himself and Family -- His First Inducements to Travel -- He is Shipwrecked, and Swims for his Life -- Gets safe on Shore in the Country of Lilliput -- Is made a Prisoner, and carried up the Country">
content = "A tutorial and reference manual for Java.">
content = "Seated family of five, coconut trees to the left, sailboats moored off sandy beach to the right, with volcano in the background.">
# and outputs it in an alternate format.  Issues warning about missing
# element name or value.
#
# Handles mixed case tags and attribute values, one per line or spanning
# several lines.  Also handles a quoted string spanning multiple lines.
# No error checking.  Does not tolerate more than one "<meta" per line.

print "@(urc;\n";
while (<>) {
        next if (! /<meta/i);
        ($meta) = /(<meta.*)/i;
        if (! />/) {
                while (<>) {
                        $meta .= $_;
                        last if (/>/);
                }
        }
        $name = $meta =~ /name\s*=\s*"([^"]*)"/i
                ? $1 : "MISSING ELEMENT NAME";
        $content = $meta =~ /content\s*=\s*"([^"]*)"/i
                ? $1 : "MISSING ELEMENT VALUE";
        ($scheme) = $meta =~ /scheme\s*=\s*"([^"]*)"/i;
        ($lang) = $meta =~ /lang\s*=\s*"([^"]*)"/i;

        if ($lang || $scheme) {
                $mod = " ($lang";
                if (! $scheme)
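The extraction technique used by the script above (find each meta tag, then pull the name, content, scheme, and lang attributes out with case-insensitive patterns) can be sketched in a self-contained form. This is only an illustration under stated assumptions: the helper name extract_meta, the sample input, and the printed layout are hypothetical and are not the output format of the original script, whose remainder is not shown here. Like the original, it treats the first ">" as the end of a tag.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns one
# hash reference per <meta ...> tag found in the given HTML text.
sub extract_meta {
    my ($html) = @_;
    my @records;
    # /s lets a tag, and a quoted value inside it, span several lines.
    while ($html =~ /(<meta\b[^>]*>)/gis) {
        my $tag = $1;
        my %rec;
        $rec{name} = $tag =~ /\bname\s*=\s*"([^"]*)"/i
            ? $1 : "MISSING ELEMENT NAME";
        $rec{content} = $tag =~ /\bcontent\s*=\s*"([^"]*)"/i
            ? $1 : "MISSING ELEMENT VALUE";
        # Optional attributes stay undefined when absent.
        ($rec{scheme}) = $tag =~ /\bscheme\s*=\s*"([^"]*)"/i;
        ($rec{lang})   = $tag =~ /\blang\s*=\s*"([^"]*)"/i;
        push @records, \%rec;
    }
    return @records;
}

# Illustrative input only; the element names are assumptions.
my $html = <<'END';
<meta name = "DC.Title"
      lang = "en"
      content = "The Green Table and the Red Chair">
<META NAME="DC.Date" CONTENT="1935">
END

for my $r (extract_meta($html)) {
    # Show lang and scheme in parentheses when either is present.
    my $mod = join ", ", grep { defined } @{$r}{qw(lang scheme)};
    print $r->{name}, ($mod ne "" ? " ($mod)" : ""), ": $r->{content}\n";
}
```

Returning records instead of printing directly keeps the parsing step separate from whatever output format is wanted.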
content = "(--mbtitle)">
content = "(--mbfilemodtime)">
content = "(--mbbaseURL)/(--mbfilename)">
content = "text/html; (--mbfilesize)">
content = "(--mblanguage)-BUREAUCRATESE">
content = "Springfield Nuclear">
href = "http://purl.org/DC/elements/1.0/">
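href = "http://nukes.org/ReactorCore/rc">

Once the variable references in it are replaced with actual values, the template above serves as the metadata block describing the document. By the conventions of our script, the following variables are replaced in both the template and the document:

(--mbfilesize)     size of the final output file
(--mbtitle)        title of the document
(--mblanguage)     language of the document
(--mbbaseURL)      beginning part of document identifier
(--mbfilename)     last part (minus .html) of identifier
(--mbfilemodtime)  last modification date of the document

Here is an HTML document to which the script is applied: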
content = "Memorandum">
From:  Acting Shift Supervisor
To:    Plant Control Personnel
RE:    (--mbtitle)
Date:  (--mbfilemodtime)
Pursuant to directive DOH:10.2001/405aec of article B-2022, subsection 48.2.4.4.1c regarding staff morale and employee productivity standards, the current allocation of doughnut acquisition funds shall be increased effective immediately.
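The replacement of (--mbVARNAME) references described above can be sketched with a single substitution over a lookup table. This is a minimal illustration, not the conversion script itself: the helper name expand_vars and the sample values are hypothetical, and references without a known value (here the file size, which is only known late) are deliberately left in place.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces every (--mbNAME) reference that has a
# value in %$vars and leaves unknown references untouched.
sub expand_vars {
    my ($text, $vars) = @_;
    $text =~ s/\(--mb(\w+)\)/exists $vars->{$1} ? $vars->{$1} : "(--mb$1)"/ge;
    return $text;
}

# Sample values are assumptions chosen for illustration.
my %vars = (
    title    => "Sample Title",
    language => "en",
    baseURL  => "http://moes.bar.com/doh",
    filename => "foo.html",
);

print expand_vars("RE:    (--mbtitle)\n", \%vars);
print expand_vars("(--mbbaseURL)/(--mbfilename)\n", \%vars);
print expand_vars("text/html; (--mbfilesize)\n", \%vars);
```

Keeping the values in a hash means a new variable needs only a new entry, not another substitution statement.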
The script that performs this conversion follows:

#!/depot/bin/perl
#
# This Perl script processes metadata block declarations of the form
# <!--metablock Title of Document --> and variable references of the
# form (--mbVARNAME), replacing them with full metadata blocks and
# variable values, respectively.  Requires a "template" file.
# Outputs an HTML file.
#
# Invoke this script with a single filename argument, "foo".  It creates
# an output file "foo.html" using a temporary working file "foo.work".
# The size of foo.work is measured after variable replacement, and is
# later inserted into the file in such a way that the file's size does
# not change in the process.  Has little or no error checking.
$infile = shift;
open(IN, "< $infile")
        or die("Could not open input file \"$infile\"");
$workfile = "$infile.work";
unlink($workfile);
open(WORK, "+> $workfile")
        or die("Could not open work file \"$workfile\"");
@offsets = ();          # records locations for late size replacement
$title = "";            # gets the title during metablock processing
$language = "en";       # pre-set language here (not in the template)
$baseURL = "http://moes.bar.com/doh";   # pre-set base URL here also
$filename = "$infile.html";             # final output filename
$filesize = "(--mbfilesize)";           # replaced late (separate pass)
sub putout {            # outputs current line with variable replacement
        if (! /\(--mb/) {
                print WORK;
                return;
        }
        if (/\(--mbfilesize\)/)                 # remember where it was
                { push @offsets, tell WORK; }   # but don't replace yet
while (<IN>) {                  # main loop for input file
        if (! /(.*)(.*)//) {
                $remainder = $1;
        }
        else {
                while (<IN>) {
                        $title .= $_;
                        last if (/(.*)\s*-->(.*)/);
                }
                $title .= $1;
                $remainder = $2;
        }
        open(TPLATE, "< template")
                or die("Could not open template file");
        while (<TPLATE>)        # subloop for template file
                { &putout; }
        close(TPLATE);
        $_ = $remainder;
        &putout;
}
close(IN);
# Now replace filesize variables without altering total byte count.
select( (select(WORK), $| = 1) [0] );   # first flush output so we
if (($size = -s WORK) < 100000)         # can get final file size
        { $scale = 0; }                 # and set scale factor or
else {                                  # compute it, keeping width
                                        # of size field low
        for ($scale = 0; $size >= 1000; $scale++)
                { $size /= 1024; }
}
$filesize = sprintf "%7.7s %sbytes",
        $size, (" ", "K", "M", "G", "T", "P") [$scale];
foreach $pos (@offsets) {       # loop through saved size locations
        seek WORK, $pos, 0;     # read the line found there
        $_ = <WORK>;
        # $filesize must be exactly as wide as "(--mbfilesize)"
        s/\(--mbfilesize\)/$filesize/g;
        seek WORK, $pos, 0;     # rewrite it with replacement
        print WORK;
}
close(WORK);
rename($workfile, "$filename")
        or die("Could not rename \"$workfile\" to \"$filename\"");
# ---- end of Perl script ----
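The late file-size replacement works only because the formatted size is exactly as wide as the "(--mbfilesize)" placeholder (14 characters), so overwriting it in place leaves the byte count unchanged. The sketch below isolates that step using the same scaling rule and sprintf format as the script; the helper name size_field and the sample sizes are hypothetical, and it operates on a string rather than on the work file.

```perl
use strict;
use warnings;

# Hypothetical helper: formats a byte count into a field that is always
# 14 characters wide, the same width as "(--mbfilesize)".
sub size_field {
    my ($size) = @_;
    my $scale = 0;
    if ($size >= 100000) {
        # Scale down by 1024 until the number fits the 7-character field.
        for (; $size >= 1000; $scale++) { $size /= 1024; }
    }
    return sprintf "%7.7s %sbytes",
        $size, (" ", "K", "M", "G", "T", "P")[$scale];
}

my $line = qq{content = "text/html; (--mbfilesize)">\n};
for my $bytes (2048, 1500000) {         # illustrative sizes
    (my $out = $line) =~ s/\(--mbfilesize\)/size_field($bytes)/e;
    # The line length is the same before and after replacement.
    printf "%d -> %d: %s", length($line), length($out), $out;
}
```

The "%7.7s" conversion both pads and truncates the number to seven characters, which is what keeps the field width constant whatever the size.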