Here document

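A here document[1], also called a heredoc, hereis, here-string or here-script, is a way of defining a string in command-line shells (such as sh, csh, ksh, bash, PowerShell and zsh) and in programming languages (such as Perl, PHP, Python and Ruby). It preserves the whitespace in the text, including line breaks and indentation. Some languages allow variable substitution and command substitution inside the string.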

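The most common syntax for a here document is << followed by an identifier; the quoted text begins on the next line and is closed by the same identifier on a line of its own. In Unix shells, here documents are generally used to supply input to a command.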
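The delimiting rule can be modelled in a few lines of Python. This is only an illustrative sketch, not how any shell is implemented; the function name read_heredoc and the sample lines are assumptions made for the example.

```python
def read_heredoc(lines, start):
    """Collect the body of a here document.

    lines[start] is assumed to contain "<<IDENT"; the body runs from the
    next line up to (not including) a line consisting only of IDENT.
    """
    # Everything after "<<" on the opening line is taken as the identifier.
    ident = lines[start].split("<<", 1)[1].strip()
    body = []
    for line in lines[start + 1:]:
        if line == ident:  # closing identifier on a line of its own
            return ident, body
        body.append(line)
    raise ValueError(f"unterminated here document: missing {ident!r}")


script = [
    "tr a-z A-Z <<END_TEXT",
    "one two three",
    "uno dos tres",
    "END_TEXT",
]
ident, body = read_heredoc(script, 0)
print(ident)  # END_TEXT
print(body)   # ['one two three', 'uno dos tres']
```

The sketch ignores quoting and the <<- form; it only shows that the body starts on the line after the marker and stops at the first line equal to the identifier.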

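Examples

The following sections give examples in different languages and environments.

Command-line shells

Unix shell

In the following examples, text is passed to the tr command using a here document.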

 $ tr a-z A-Z <<END_TEXT
 > one two three
 > uno dos tres
 > END_TEXT
 ONE TWO THREE
 UNO DOS TRES

[2]

END_TEXT is used as the identifier; it marks the start and the end of the here document. ONE TWO THREE and UNO DOS TRES are the output of tr after execution.

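Appending a minus sign to << causes leading tab characters to be ignored. This allows here documents in shell scripts to be indented without changing their value. (Note that on the command line you may need to type Ctrl-v TAB to actually enter a tab character. The example below uses spaces to stand in for tabs; do not copy and paste it.)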

 $ tr a-z A-Z <<-END_TEXT
 >         one two three
 >         uno dos tres
 > END_TEXT
 ONE TWO THREE
 UNO DOS TRES
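The effect of <<- on the body can be approximated in Python as follows. This is a sketch under the assumption that only leading tab characters are removed and spaces are left alone; strip_leading_tabs is a hypothetical helper name.

```python
def strip_leading_tabs(body_lines):
    """Model of <<-: drop leading tab characters (not spaces) from each line."""
    return [line.lstrip("\t") for line in body_lines]


indented = ["\t\tone two three", "\t\tuno dos tres", "  spaces stay"]
print(strip_leading_tabs(indented))
# ['one two three', 'uno dos tres', '  spaces stay']
```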

By default, variable substitution and command substitution are performed:

 $ cat << EOF
 > Working dir $PWD
 > EOF
 Working dir /home/user

This can be disabled by quoting the identifier. Either single or double quotes may be used:

 $ cat << "EOF"
 > Working dir $PWD
 > EOF
 Working dir $PWD
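The difference between a quoted and an unquoted identifier can be sketched in Python with string.Template. The sketch models variable substitution only (not command substitution), and render_heredoc and its arguments are hypothetical names chosen for illustration.

```python
from string import Template


def render_heredoc(body, delimiter, variables):
    """Expand $NAME in body unless the delimiter was written in quotes."""
    quoted = delimiter[:1] in ("'", '"')
    if quoted:
        return body  # quoted delimiter: the body is taken literally
    # safe_substitute leaves unknown $names untouched instead of raising
    return Template(body).safe_substitute(variables)


env = {"PWD": "/home/user"}
print(render_heredoc("Working dir $PWD", "EOF", env))    # Working dir /home/user
print(render_heredoc("Working dir $PWD", '"EOF"', env))  # Working dir $PWD
```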

bash, ksh and zsh also support here-strings:

 $ tr a-z A-Z <<<"Yes it is a string"
 YES IT IS A STRING

Windows command line

No equivalent syntax has been found so far. The following code may be useful instead.

set GREETING=Hello
echo %GREETING%
cmd /k 
  echo %GREETING%
  set GREETING=Goodbye
  echo %GREETING% 
exit
echo %GREETING%

C:\>
C:\>set GREETING=Hello

C:\>echo %GREETING%
Hello

C:\>cmd /k
C:\>  echo %GREETING%
Hello

C:\>  set GREETING=Goodbye

C:\>  echo %GREETING%
Goodbye

C:\>exit

C:\>echo %GREETING%
Hello

C:\>

Windows PowerShell

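In Windows PowerShell, here documents are called here-strings. A here-string is a string that starts with @" or @' and ends with "@ or '@ on a line of its own. All characters between the opening and closing delimiters are treated as part of the string literal[3].

A here-string in double quotes allows variable substitution; one in single quotes does not[4].

Variable substitution only happens for simple variables (such as $x, but not $x.y or $x[0]).

A command can be placed inside $() to obtain the result of executing it.

In the following PowerShell code, text is passed to a function using a here-string. The function ConvertTo-UpperCase is defined as follows: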

PS> function ConvertTo-UpperCase($string) { $string.ToUpper() }
PS> ConvertTo-UpperCase @'
>> one two three
>> eins zwei drei
>> '@
>>
ONE TWO THREE
EINS ZWEI DREI

The following example demonstrates variable substitution and command substitution in a double-quoted here-string:

$doc, $marty = 'Dr. Emmett Brown', 'Marty McFly'
$time = [DateTime]'Friday, October 25, 1985 8:00:00 AM'
$diff = New-TimeSpan -Minutes 25
@"
$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!
"@

Output:

Dr. Emmett Brown : Are those my clocks I hear?
Marty McFly : Yeah! Uh, it's 8 o'clock!
Dr. Emmett Brown : Perfect! My experiment worked! They're all exactly 25 minutes slow.
Marty McFly : Wait a minute. Wait a minute. Doc... Are you telling me that it's 08:25?
Dr. Emmett Brown : Precisely.
Marty McFly : Damn! I'm late for school!

If a single-quoted here-string were used instead, the output would look like this:

$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!
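For comparison only, a loosely analogous contrast exists in Python between a triple-quoted f-string, which evaluates expressions in braces, and a plain triple-quoted string, which keeps them literally. The sketch below is not PowerShell and its variable names are chosen for the example.

```python
from datetime import datetime, timedelta

doc, marty = "Dr. Emmett Brown", "Marty McFly"
when = datetime(1985, 10, 25, 8, 0, 0)
diff = timedelta(minutes=25)

# f-string: names and expressions inside {} are evaluated
expanded = f"""{doc} : Are those my clocks I hear?
{marty} : It's {(when + diff).strftime('%H:%M')}!"""

# plain string: the braces and their contents are kept as written
literal = """{doc} : Are those my clocks I hear?
{marty} : It's {(when + diff).strftime('%H:%M')}!"""

print(expanded)
print(literal)
```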

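Programming languages

C++

C++11 introduced raw string literals. A raw string literal is prefixed with R, begins with "delimiter( and ends with )delimiter". The delimiter may be 0 to 16 characters long and may contain any basic character except spaces, parentheses and backslashes.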

char const *a = R"(The escape sequence '\n' represents a newline character.)";

wchar_t const *b = LR"...(Raw strings look like R"(...)")...";

char16_t const *c = uR"xyz(
Universal character names such as "\u5367\u864E\u85CF\u3863" are not
processed in raw string literals. Therefore the above can be written
as "臥虎藏龍" in a raw string literal, but only if the source character
set contains those characters.
)xyz";
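The purpose of the delimiter is to allow the text itself to contain )". A small Python sketch can show how a generator might pick a delimiter that does not collide with the text. The helper name cpp_raw_literal is hypothetical, and restricting the delimiter to lowercase letters is a simplification made for the example.

```python
import itertools
import string


def cpp_raw_literal(text, max_len=16):
    """Wrap text in a C++11 raw string literal, choosing a safe delimiter."""
    alphabet = string.ascii_lowercase  # simplification: letters only
    for n in range(max_len + 1):       # try the empty delimiter first
        for combo in itertools.product(alphabet, repeat=n):
            delim = "".join(combo)
            # The literal would end early if )delim" occurred in the text.
            if f'){delim}"' not in text:
                return f'R"{delim}({text}){delim}"'
    raise ValueError("no usable delimiter found")


print(cpp_raw_literal("plain"))     # R"(plain)"
print(cpp_raw_literal('R"(...)"'))  # R"a(R"(...)")a"
```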

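D

Since version 2.0, D has supported here-strings introduced by the prefix q. These strings begin with a bracket (<>, [], () or {}) or with an identifier followed by a newline.

The following D code shows here-strings delimited by brackets and by an identifier.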

import std.stdio; // provides writef

int main() {
    string list = q"[1. Item One
2. Item Two
3. Item Three]";
    writef( list );
    return 0;
}

Using an identifier:

import std.stdio; // provides writef

int main() {
    string list = q"IDENT
1. Item One
2. Item Two
3. Item Three
IDENT";
    writef( list );
    return 0;
}

Lua

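Lua uses [[ and ]] to delimit literal strings; newlines inside the literal are kept as they are, and escape sequences are not interpreted. This form is awkward for text that itself contains ]], such as long comments (--[[comment]]) or certain expressions (x = a[b[c]]). Version 5.1 therefore added a new syntax: any number of equals signs may be placed between the two opening brackets, and only a closing bracket with the same number of equals signs ends the string.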

local ls = [[
Initial newline isn't part of the string.
Two lines.]]
local lls = [==[
This notation can be used for Windows paths: 
local path = [=[C:\Windows\Fonts]=]
]==]
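The rule for choosing the number of equals signs can be illustrated with a short Python sketch. Python is used only for illustration and lua_long_string is a hypothetical helper name. It picks the smallest level for which the closing bracket cannot occur prematurely, and writes a newline after the opening bracket, which Lua does not include in the string.

```python
def lua_long_string(text):
    """Quote text as a Lua long string with the smallest workable level."""
    level = 0
    while True:
        eq = "=" * level
        closing = "]" + eq + "]"
        # The closing bracket must not appear inside the text, nor be
        # formed by the end of the text running into the real closing.
        if closing not in text + "]" + eq:
            # The newline right after the opening bracket is skipped by Lua.
            return f"[{eq}[\n{text}{closing}"
        level += 1


print(lua_long_string("plain"))
print(lua_long_string(r"local path = [=[C:\Windows\Fonts]=]"))
```

For the second call the sketch settles on level 2, that is [==[ and ]==], because the text already contains ]=] and ends in a bracket.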

Perl

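There are several different ways to use here documents in Perl[5]. Quoting the tag of a here document has the same effect as the corresponding ordinary string literal: double quotes around the tag allow variable interpolation, single quotes do not, and an unquoted tag behaves like a double-quoted one. Backticks around the tag cause the here document to be executed as a shell script and its output to be captured. The closing tag must be at the start of a line, otherwise the interpreter will not recognize it.

Note that the here document does not start at the tag but on the following line, so the statement containing the tag continues after the tag.

Here is an example using double quotes: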

my $sender = "Buffy the Vampire Slayer";
my $recipient = "Spike";

print <<"END";

Dear $recipient, 

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender

END

Output:

Dear Spike,

I wish you to leave Sunnydale and never return.

Not Quite Love,
Buffy the Vampire Slayer

Here is an example using single quotes:

print <<'END';
Dear $recipient,

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender
END

Output:

Dear $recipient,

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender

Another example, using backticks (may not be portable):

my $shell_script_stdout = <<`END`;
echo foo
echo bar
END

Several here documents can be started on the same line:

say(<<BEGIN . "this is the middle\n" . <<END);
This is the beginning:
BEGIN
And now it is over!
END

# The above is equivalent to:
say("This is the beginning:\nthis is the middle\nAnd now it is over!");

The tag itself may contain whitespace, which allows here documents to be used without breaking indentation:

  say <<'  END';
Hello World
  END

PHP

<?php
 
$name       = "Joe Smith";
$occupation = "Programmer";
echo <<<EOF

	This is a heredoc section.
	For more information talk to $name, your local $occupation.

	Thanks!

EOF;

$toprint = <<<EOF

	Hey $name! You can actually assign the heredoc section to a variable!

EOF;
echo $toprint;

?>

Output:

This is a heredoc section.
For more information talk to Joe Smith, your local Programmer.
 
Thanks!
  
Hey Joe Smith! You can actually assign the heredoc section to a variable!

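In PHP versions before 7.3, the line containing the closing identifier must not contain any characters other than an (optional) semicolon. Otherwise it is not recognized as a closing identifier, and PHP continues looking for one. If no closing identifier is found, a parse error is reported at the last line[6].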

In PHP 5.3 and later, as in Perl, the identifier can be enclosed in single quotes to prevent variable expansion; this is called a nowdoc[7]:

$x = <<<'END'
Dear $recipient,

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender
END;

In PHP 5.3 and later, the identifier can also be enclosed in double quotes; as in Perl, this has the same effect as using no quotes.

Python

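Python supports string literals delimited by three consecutive single or double quotes (''' or """). These literals can span multiple lines and serve the purpose of here documents.

A simple Python 3 compatible example, equivalent to the first Perl example above: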

message="""Dear {recipient},

I wish you to leave Sunnydale and never return.

Not Quite Love,
{sender}
"""
print(message.format(sender='Buffy the Vampire Slayer', recipient='Spike'))

In versions before Python 3.0, use the print statement instead of the print function.
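When a triple-quoted string appears inside indented code, the standard library function textwrap.dedent can remove the common indentation, which plays a role comparable to <<- in shells. A minimal sketch follows; the function name build_message is chosen for the example.

```python
import textwrap


def build_message(sender, recipient):
    # The backslash after the opening quotes suppresses the first newline;
    # dedent() removes the indentation common to all lines.
    template = textwrap.dedent("""\
        Dear {recipient},

        I wish you to leave Sunnydale and never return.

        Not Quite Love,
        {sender}""")
    return template.format(sender=sender, recipient=recipient)


print(build_message("Buffy the Vampire Slayer", "Spike"))
```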

R

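R allows whitespace, including newlines, inside strings. No variable substitution is performed. A string can be turned into a file descriptor with the textConnection() function. For example, the following code converts a data table embedded in the source code into a data frame variable: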

str <-
"State          Population Income Illiteracy Life.Exp Murder HS.Grad Frost
Alabama              3615   3624        2.1    69.05   15.1    41.3    20
Alaska                365   6315        1.5    69.31   11.3    66.7   152
Arizona              2212   4530        1.8    70.55    7.8    58.1    15
Arkansas             2110   3378        1.9    70.66   10.1    39.9    65"
x <- read.table(textConnection(str), header=TRUE, row.names=1)
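A comparable idea in Python is to wrap a multi-line string in io.StringIO, which yields a file-like object that can be read line by line. The sketch below uses a shortened, made-up excerpt of the table, and read_table is a hypothetical helper, not a library function.

```python
import io

table = """\
State     Population Income
Alabama         3615   3624
Alaska           365   6315
"""


def read_table(text):
    """Parse whitespace-separated text with a header row.

    Returns a dict keyed by the first column of each row.
    """
    stream = io.StringIO(text)  # file-like object over the string
    header = stream.readline().split()
    rows = {}
    for line in stream:
        fields = line.split()
        if fields:
            rows[fields[0]] = dict(zip(header[1:], map(float, fields[1:])))
    return rows


data = read_table(table)
print(data["Alaska"]["Income"])  # 6315.0
```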

Racket

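In Racket, a here string starts with #<<, followed immediately by the characters that define the terminator of the string[8].

The content of the string consists of everything between the #<< line and the line whose only content is the specified terminator. That is, the content starts on the line after #<< and ends on the line before the terminator.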

#lang racket

(displayln
 #<<HERESTRING
This is a simple here string in Racket.
  * One
  * Two
  * Three
HERESTRING
 )

Output:

This is a simple here string in Racket.
  * One
  * Two
  * Three

Escape sequences are not recognized in here strings; all characters in the string (and in the terminator) are kept as they are.

#lang racket

(displayln
 #<<A here string in Racket 
This string spans for multiple lines
and can contain any Unicode symbol.
So things like λ, ☠, α, β, are all fine.

In the next line comes the terminator. It can contain any Unicode symbol as well, even spaces and smileys!
A here string in Racket 
 )

Output:

This string spans for multiple lines
and can contain any Unicode symbol.
So things like λ, ☠, α, β, are all fine.

In the next line comes the terminator. It can contain any Unicode symbol as well, even spaces and smileys!

Here strings can be used like ordinary strings:

#lang racket

(printf #<<END
Dear ~a,

Thanks for the insightful conversation ~a.

                ~a

END
        "Isaac"
        "yesterday"
        "Carl")

Output:

Dear Isaac,

Thanks for the insightful conversation yesterday.

                Carl

An interesting alternative is to use the language extension at-exp to write @-expressions[9].

They look like this:

#lang at-exp racket

(displayln @string-append{
  This is a long string,
  very convenient when a
  long chunk of text is
  needed.
  
  No worries about escaping
  "quotes". It's also okay
  to have λ, γ, θ, ...
  
  Embed code: @|(number->string (+ 3 4))|
  })

Output:

This is a long string,
very convenient when a
long chunk of text is
needed.

No worries about escaping
"quotes". It's also okay
to have λ, γ, θ, ...

 Embed code: 7


Ruby

The following Ruby code displays a list using a here document:

puts <<GROCERY_LIST
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
 
* Organic
GROCERY_LIST

[10]

Output:

Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*

* Organic

Writing to a file:

File::open("grocery-list", "w") do |f|
  f << <<GROCERY_LIST
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
 
* Organic
GROCERY_LIST
end

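Ruby also allows the closing identifier not to start at the beginning of the line, provided the here document is opened with <<-.

In addition, Ruby treats a here document like a double-quoted string, so #{} can be used to interpolate code.

The following example shows both features: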

now = Time.now
puts <<-EOF
  It's #{now.hour} o'clock John, where are your kids?
  EOF

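However, if the identifier is enclosed in single quotes, the here document is treated like a single-quoted string[10].

Like Perl, Ruby allows several here documents to be started on one line[10]: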

puts <<BEGIN + "<--- middle --->\n" + <<END
This is the beginning:
BEGIN
And now it is over!
END

# The above is equivalent to:
puts "This is the beginning:\n<--- middle --->\nAnd now it is over!"
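How several here documents on one line consume the following lines in order can be modelled in Python. This is an illustrative sketch rather than a description of the Ruby parser; read_heredocs is a hypothetical helper that only recognizes unquoted identifiers.

```python
import re


def read_heredocs(lines):
    """Return the bodies of all <<IDENT markers on lines[0], in order."""
    idents = re.findall(r"<<(\w+)", lines[0])
    bodies, pos = [], 1
    for ident in idents:
        end = lines.index(ident, pos)  # first closing line for this tag
        bodies.append(lines[pos:end])
        pos = end + 1                  # the next body starts after the tag
    return bodies


source = [
    'puts <<BEGIN + "<--- middle --->\\n" + <<END',
    "This is the beginning:",
    "BEGIN",
    "And now it is over!",
    "END",
]
print(read_heredocs(source))
# [['This is the beginning:'], ['And now it is over!']]
```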

Tcl

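Tcl has no special syntax for here documents, because the ordinary string syntax already allows embedded newlines and preserves indentation. Strings delimited by braces undergo no substitution: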

puts {
Grocery list
------------
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
 
* Organic
}

Strings delimited by quotes are substituted at run time:

set sender "Buffy the Vampire Slayer"
set recipient "Spike"

puts "
Dear $recipient, 

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender
"

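In brace-delimited strings, opening and closing braces must be balanced. In quote-delimited strings, braces may be unbalanced, but backslashes, dollar signs and square brackets all trigger substitution, and the first unescaped double quote ends the string.

Note that the first and last characters of the strings above are newlines. string trim can be used to remove the leading and trailing blank lines: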

puts [string trim "
Dear $recipient, 

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender
" \n]

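Other

Microsoft NMAKE

In Microsoft NMAKE, here documents are called inline files. An inline file begins with << or <<filename[11]. The first form creates a temporary file; the second creates (or overwrites) the named file. Every inline file is terminated by << on a line of its own, optionally followed by the case-insensitive keyword KEEP or NOKEEP, which determines whether the file is retained. Specifying neither is the same as specifying NOKEEP[12].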

target0: dependent0
    command0 <<
temporary inline file
...
<<

target1: dependent1
    command1 <<
temporary inline file, but kept
...
<<KEEP

target2: dependent2
    command2 <<filename2
named inline file, but deleted after use
...
<<NOKEEP

target3: dependent3
    command3 <<filename3
named inline file
...
<<KEEP
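The four combinations can be mimicked in Python to make the file lifetime explicit. This is only an analogy built on the standard tempfile and os modules; run_with_inline_file, count_lines and their parameters are hypothetical names and not part of NMAKE.

```python
import os
import tempfile


def run_with_inline_file(text, command, filename=None, keep=False):
    """Write text to a file, pass its path to command, then keep or delete it."""
    if filename is None:
        fd, path = tempfile.mkstemp(text=True)  # like "<<": temporary file
        os.close(fd)
    else:
        path = filename                         # like "<<filename"
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    try:
        return command(path)
    finally:
        if not keep:                            # NOKEEP is the default
            os.remove(path)


def count_lines(path):
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


print(run_with_inline_file("line one\nline two\n", count_lines))  # 2
```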

References

  1. HERE documents in the Bash shell (cat << EOF). Retrieved 2012-07-16. Archived from the original on 2012-05-03.
  2. A detailed explanation of here documents on Unix systems. Retrieved 2012-07-16. Archived from the original on 2016-03-10.
  3. Q. What is a here-string in Windows PowerShell?. Retrieved 2012-07-16. Archived from the original on 2013-04-28.
  4. Variable expansion in strings and here-strings - Windows PowerShell Blog. Retrieved 2012-07-16. Archived from the original on 2012-06-30.
  5. Perl operators and precedence. Retrieved 2012-07-16. Archived from the original on 2012-07-17.
  6. Heredoc in PHP manual. Retrieved 2012-07-16. Archived from the original on 2012-07-12.
  7. PHP: Strings - Manual. Retrieved 2012-07-16. Archived from the original on 2012-07-03.
  8. Here string in Racket Documentation. Retrieved 2012-07-16. Archived from the original on 2011-09-03.
  9. @ Syntax in Racket Documentation. Retrieved 2012-07-16. Archived from the original on 2012-01-22.
  10. Ruby's here document mini tutorial. Retrieved 2012-07-16. Archived from the original on 2012-07-12.
  11. Specifying an Inline File. Retrieved 2012-07-16. Archived from the original on 2019-10-17.
  12. Creating Inline File Text. Retrieved 2012-07-16. Archived from the original on 2016-05-17.
