Aliasing (computing)
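Aliasing describes a situation in which a data location in memory can be accessed through more than one name in a program. Modifying the data through one name implicitly changes the value associated with every other alias, which the programmer may not expect. As a result, aliasing makes programs harder to understand, analyze, and optimize. Alias analysis aims to compute useful information about the aliasing in a program.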
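Examples
Buffer overflow
Most implementations of C do not perform array bounds checking. A program can therefore, by accident or by deliberately exploiting this gap, write outside the bounds of an array (a buffer overflow). The C standard classifies this as undefined behavior, but on most implementations without bounds checking it can produce the aliasing effect described above: changing data through one name changes the value seen through its alias.
If an array is created on the call stack and a variable happens to be laid out in memory directly before or after it, writing to an out-of-range element may overwrite that variable. For example, suppose a two-element int array named arr is followed directly by an int variable named i. If arr[2] (the nonexistent third element) occupies the same location as i, the two names are aliases of each other.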
#include <stdio.h>

int main(void)
{
    int arr[2] = { 1, 2 };
    int i = 10;

    /* Write beyond the end of arr. Undefined behaviour in standard C, will write to i in some implementations. */
    arr[2] = 20;

    printf("element 0: %d \t", arr[0]); // outputs 1
    printf("element 1: %d \t", arr[1]); // outputs 2
    printf("element 2: %d \t", arr[2]); // outputs 20, if aliasing occurred
    printf("i: %d \t\t", i); // might also output 20, not 10, because of aliasing, but the compiler might have i stored in a register and print 10

    /* arr size is still 2. sizeof yields a size_t, so it is printed with %zu. */
    printf("arr size: %zu \n", sizeof(arr) / sizeof(int));
    return 0;
}
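This outcome is possible on some implementations of C because an array is a block of contiguous memory, and an element is accessed by indirect addressing: the element's address is the array's base address plus the index multiplied by the element size. Because C performs no bounds checking, such an access can reach beyond the end of the array. The aliasing effect shown here is nevertheless undefined behavior. Some implementations do not place stack variables directly adjacent to an array, for example because they align variables to the processor's word size. The C standard does not generally specify how data is laid out in memory (ISO/IEC 9899:1999, section 6.2.6.1).
A C compiler is also permitted to produce no aliasing effect at all for accesses outside the bounds of an array.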
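Aliased pointers
Another kind of aliasing can occur in any language in which different names (such as pointers) can refer to the same memory location. The XOR swap algorithm is an example: a function implementing it takes two pointers and assumes that they point to different locations. If the two pointers are in fact equal (that is, aliases of each other), the function fails, because XORing a value with itself sets it to zero. This is a common problem for functions that accept pointer arguments. Whether such a function tolerates aliased arguments must be documented explicitly, particularly for functions that perform complex manipulations on the memory areas passed to them.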
Specified aliasing
Controlled aliasing behaviour may be desirable in some cases (that is, aliasing behaviour that is specified, unlike that enabled by memory layout in C). It is common practice in Fortran. The Perl programming language specifies, in some constructs, aliasing behaviour, such as in foreach loops. This allows certain data structures to be modified directly with less code. For example,
my @array = (1, 2, 3);
foreach my $element (@array) {
    # Increment $element, thus automatically
    # modifying @array, since $element is aliased
    # to each of @array's elements in turn.
    $element++;
}
print "@array \n";
will print out "2 3 4" as a result. If one wanted to bypass aliasing effects, one could copy the contents of the index variable into another and change the copy.
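Conflicts with optimization
Optimizing compilers often have to make conservative assumptions about variables when pointers are involved. For example, knowing the value of a variable normally enables constant propagation, but the compiler cannot rely on that value after a store through a pointer that might alias the variable. Aliasing also affects code reordering: only if the compiler can establish that two accesses do not alias may it move one past the other, which may improve instruction scheduling or enable more loop optimizations.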
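The C99 edition of the C standard specifies the strict aliasing rule (section 6.5, paragraph 7), under which accessing the same memory location through pointers of different types is, with a few exceptions, not allowed. A compiler may therefore assume that pointers of different types do not alias, which can yield a substantial performance gain.[1] Some well-known projects, such as Python 2, violated this rule.[2] The Linux kernel has also had to deal with similar problems.[3] With gcc, the rule can be disabled with the compiler option -fno-strict-aliasing.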
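The types through which an object may nevertheless be accessed are listed below, in the form given by the corresponding C++ rule:
- the dynamic type of the object
- a cv-qualified version of the dynamic type
- the signed or unsigned type corresponding to the dynamic type
- an aggregate type (such as a struct or class) or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, those of nested types)
- a base class type of the dynamic type
- char or unsigned char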
Hardware aliasing
The term aliasing is also used to describe the situation where, due to either a hardware design choice or a hardware failure, one or more of the available address bits is not used in the memory selection process.[4] This may be a design decision if there are more address bits available than are necessary to support the installed memory device(s). In a failure, one or more address bits may be shorted together, or may be forced to ground (logic 0) or the supply voltage (logic 1).
- Example
For this example, we assume a memory design with 8 locations, requiring only 3 address lines (or bits), since 2^3 = 8. Address bits (named A2 through A0) are decoded to select unique memory locations as follows, in standard binary counter fashion:
A2 | A1 | A0 | Memory location |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
1 | 0 | 0 | 4 |
1 | 0 | 1 | 5 |
1 | 1 | 0 | 6 |
1 | 1 | 1 | 7 |
In the table above, each of the 8 unique combinations of address bits selects a different memory location. However, if one address bit (say A2) were to be shorted to ground, the table would be modified as follows:
A2 | A1 | A0 | Memory location |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
In this case, with A2 always being zero, the first four memory locations are duplicated and appear again as the second four. Memory locations 4 through 7 have become inaccessible.
If this change occurred to a different address bit, the decoding results would be different, but in general the effect would be the same: the loss of a single address bit cuts the available memory space in half, with resulting duplication (aliasing) of the remaining space.
See also
References
1. Mike Acton. Understanding Strict Aliasing. 2006-06-01. Retrieved 2017-11-20. (Archived from the original on 2013-05-08.)
2. Neil Schemenauer. ANSI strict aliasing and Python. 2003-07-17. Retrieved 2017-11-20. (Archived from the original on 2020-06-05.)
3. Linus Torvalds. Re: Invalid compilation without -fno-strict-aliasing. 2003-02-26. Retrieved 2017-11-20. (Archived from the original on 2020-11-12.)
4. Michael Barr. Software Based Memory Testing. 2012-07-27. Retrieved 2017-11-20. (Archived from the original on 2020-11-29.)
External links
- Understanding Strict Aliasing (archived copy at the Internet Archive) – article by Mike Acton
- Aliasing, pointer casts and gcc 3.3 (archived copy at the Internet Archive) – informational article on NetBSD mailing list
- Type-based alias analysis in C++ (archived copy at the Internet Archive) – informational article on type-based alias analysis in C++
- Understand C/C++ Strict Aliasing (archived copy at the Internet Archive) – article on strict aliasing originally from the boost developer's wiki