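Wikipedia:Persondata
This page is currently inactive and is retained for historical reference only. It was last updated on Saturday, 25 June 2022, at 12:06 (UTC). Its content may no longer have clear consensus support, or may no longer be relevant to the topic under discussion. If you wish to restart discussion, seek broader input at the Village Pump.
Persondata is no longer in use (see the talk page).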
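Persondata is a special set of metadata that can be added only to biographical articles. It holds basic information about a person (name, short description, dates of birth and death, place of birth, and so on). Such markup allows standardized personal data to be extracted quickly, and lets people be categorized and analysed statistically.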
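Adding the {{Persondata}} template to a biographical article does not affect how the article normally works, and the template is not displayed in the article.
As of July 2014, the template had been added to 1,157,000 articles on the English Wikipedia and 537,000 articles on the German Wikipedia.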
The WikiProject that works to improve the usage of Persondata, WikiProject Persondata, is seeking contributors.
Purpose
Without uniform formatting, such as the (born .... died ...) parentheses, it is very difficult to extract useful information from biographical articles automatically. It is also hard to alphabetize all the biographical articles automatically, since the titles typically begin with the person's first name (although we have DEFAULTSORT for that). By adding standardized metadata to such articles, we can facilitate the creation of new applications for Wikipedia content, such as Wikipedia CD-ROMs, custom search applications, etc. Hopefully, this will be the first of many steps towards enriching Wikipedia with semantic content.
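Usage
Wikidata
Wikidata is still under development as a metadata system. Eventually, persondata information will be moved to Wikidata and persondata will be deprecated. For now, however, many users still add and maintain persondata. Users are encouraged to help improve Wikidata alongside it; this includes creating and maintaining Wikidata entries for people, as well as working on the mechanics of Wikidata itself.
This template cannot yet retrieve its information from Wikidata: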
Parameter | Type | Property ID |
---|---|---|
NAME | Firstname Lastname | Label, e.g. Michel Velleman (Q151605) |
ALTERNATIVE NAMES | Other Name1, Othername2 | <?> |
SHORT DESCRIPTION | Claim to fame | <?> |
DATE OF BIRTH | day or year of birth | P569 |
PLACE OF BIRTH | birthplace | P19 |
DATE OF DEATH | day or year of death | P570 |
PLACE OF DEATH | deathplace | P20 |
Viewing persondata
By default, persondata is invisible to normal users. To make persondata visible, you must:
- Either install this JavaScript in Special:Mypage/skin.js, which will add a button to the top button bar of every page allowing you to easily show and hide persondata boxes;
- Or edit your user stylesheet as explained below, causing persondatas to be always visible;
- Or even do both, as one method doesn't interfere with the other and the above JavaScript has useful persondata-editing features.
To make persondatas permanently visible, first make sure you are logged in. Then edit (or create) a page at Special:Mypage/skin.css and add the following line:
table.persondata {display:table !important;}
or, if you use Microsoft Internet Explorer 7 or earlier:
table.persondata {display:block !important;}
Tip: After saving the CSS, you must empty the browser cache to see the changes: Mozilla/Firefox (Windows): Ctrl-Shift-R; Mozilla/Firefox/Camino (Mac): Cmd-Shift-R; Internet Explorer (Windows): Ctrl-F5; Opera (all): F5; Safari (Mac): Cmd-R; Konqueror (Linux): Ctrl-R. Some Firefox (Linux) users report that both lines must be present in their monobook.css (though they can probably be simplified to table.persondata {display:block table !important;}), and users who switch between browsers and platforms may need to do likewise.
If you can see a block with data about Ferdinand Magellan between this paragraph and the next, you have successfully made persondata visible: Template:Persondata Otherwise this paragraph will follow directly below the previous one.
To make the persondata box invisible again, simply remove the CSS line provided above from your user stylesheet.
Warning: Since persondatas are by default invisible, editors rarely plan for them when designing the layout of an article, which means that making them visible might cause some article footers to look strange for you. For the same reason, if you have persondatas visible while editing and previewing, remember that most people don't, so planning the layout to accommodate it might cause them to find a strange-looking article footer. Thus, take care to edit from the perspective of the majority. It is best to follow the persondata placement advice given on this page.
Using the template
Position
To use the {{Persondata}} template, copy the wikitext below to the end of a biographical article and fill in the parameters manually. Alternatively, use this JavaScript, or use AWB, which can add the template and fill in the information semi-automatically from infoboxes. If you add the template manually, place it just before the categories. {{DEFAULTSORT:Sort key}} is not a real template but a direct part of categorization, and therefore should be located between persondata and categories. The same applies to the {{Lifetime}} template, which implements DEFAULTSORT.
{{Persondata
| NAME =
| ALTERNATIVE NAMES =
| SHORT DESCRIPTION =
| DATE OF BIRTH =
| PLACE OF BIRTH =
| DATE OF DEATH =
| PLACE OF DEATH =
}}
Next, fill out the data fields. Make sure the name is entered with the surname first (the same way you would with a category listing). Do not delete empty data fields; for example, if a person is still alive, leave the date and place of death blank. Here is an example of a properly filled out template:
{{Persondata
| NAME = Magellan, Ferdinand
| ALTERNATIVE NAMES = Magalhães, Fernão de (Portuguese); Magallanes, Fernando de (Spanish)
| SHORT DESCRIPTION = Sea explorer
| DATE OF BIRTH = Early 1480
| PLACE OF BIRTH = Sabrosa, Portugal
| DATE OF DEATH = 27 April 1521
| PLACE OF DEATH = Mactan Island, Cebu, Philippines
}}
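For editors who generate the template from a script rather than by hand, the filling-in step could be sketched roughly as follows. This is only an illustration: the helper name build_persondata and the FIELDS tuple are hypothetical and not part of any Wikipedia tool. It emits all seven fields in the documented order and leaves missing ones blank, as the advice above asks.

```python
# Field names in the order used by the {{Persondata}} template.
FIELDS = (
    "NAME",
    "ALTERNATIVE NAMES",
    "SHORT DESCRIPTION",
    "DATE OF BIRTH",
    "PLACE OF BIRTH",
    "DATE OF DEATH",
    "PLACE OF DEATH",
)


def build_persondata(values):
    """Return {{Persondata}} wikitext for a dict of field values.

    Hypothetical helper: unknown keys are rejected, and fields that are
    missing from `values` are still emitted, with an empty value.
    """
    unknown = set(values) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown persondata fields: {sorted(unknown)}")
    lines = ["{{Persondata"]
    for field in FIELDS:
        # rstrip() avoids a trailing space after "=" for empty fields.
        lines.append(f"| {field} = {values.get(field, '')}".rstrip())
    lines.append("}}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_persondata({
        "NAME": "Magellan, Ferdinand",
        "SHORT DESCRIPTION": "Sea explorer",
        "DATE OF DEATH": "27 April 1521",
    }))
```

Keeping the empty fields in the output means a later editor (or extraction script) always finds the same seven keys.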
Parameters
The parameters NAME, ALTERNATIVE NAMES, SHORT DESCRIPTION, DATE OF BIRTH, PLACE OF BIRTH, DATE OF DEATH, and PLACE OF DEATH are used to construct a persondata record. These fields may be extended in the future. Currently it isn't necessary to provide wikilinks in them; however, these might be useful in some future application, so feel free to add them to locations if you wish.
Please follow these general guidelines when filling these fields:
Name and titles
The person's most commonly known name should be in the |NAME= field, in the following format: Family Name, Given Name Middle Names, title. For most cases this will be straightforward. For example, "George Walker Bush" becomes "Bush, George Walker". Family-name-first names, common in Asia, do not take a comma: "Ho Chi Minh" is specified in this parameter as "Ho Chi Minh".
In some cases, however, there may be ambiguity about a person's surname. When in doubt, format the name according to how you would expect it to be alphabetized. For example, Ludwig van Beethoven would be alphabetized under "Beethoven", while Townes Van Zandt would be alphabetized under "Van Zandt". Also please note that some multi-part family names (e.g. commonly in Spanish) may appear to be a middle name and family name to a native English speaker. If you are unsure, ask someone familiar with the subject how they would alphabetize the name or consult a cataloging guide such as the AACR2. Also check the Library of Congress at [1] or the German National Library (Deutsche Nationalbibliothek) at [2].
For European names with "van/Van", "del/Del", etc., the most common Continental European practice is to alphabetize by the significant part of the name (e.g. "Zandt", "Toro"), while the typical UK, North American, Australian and New Zealand practice is to alphabetize by the entire surname (e.g. "Van Zandt", "del Toro"). Names that do not include a family name should be given as-is, e.g. "Brutus of Troy", not treated as if they had a family name, e.g. not "Troy, Brutus of".
It is usually a good idea to list as much of a person's name as possible in the name field to avoid confusion with similar names. Unless it is part of a title of nobility, do not include a title; i.e., do not include honorifics such as "Dr.", "Professor", or "PhD".
Articles which do not have a |NAME= field, or leave the field blank, will be added to Category:Persondata templates without name parameter. Please follow these guidelines when working on articles in this category.
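The naming rules in this section can be illustrated with a small sketch. It assumes the caller has already decided which part of the name is the family name (the hard part, as discussed above); the function name sort_name and its parameters are hypothetical and only serve the illustration.

```python
def sort_name(given_names, family_name="", family_first=False, title=""):
    """Format a value for the NAME field (illustrative helper only).

    given_names  -- given and middle names, or the whole name when the
                    person has no family name (e.g. "Brutus of Troy")
    family_name  -- the part the name should be alphabetized under
    family_first -- True for family-name-first names, which take no comma
    title        -- optional title of nobility, appended after a comma
    """
    given_names = given_names.strip()
    family_name = family_name.strip()
    if not family_name:
        name = given_names                       # given as-is
    elif family_first:
        name = f"{family_name} {given_names}"    # e.g. "Ho Chi Minh"
    else:
        name = f"{family_name}, {given_names}"   # e.g. "Bush, George Walker"
    if title:
        name = f"{name}, {title}"
    return name


if __name__ == "__main__":
    print(sort_name("George Walker", "Bush"))
    print(sort_name("Chi Minh", "Ho", family_first=True))
    print(sort_name("Townes", "Van Zandt"))
    print(sort_name("Brutus of Troy"))
```

The sketch deliberately does not try to guess where the family name starts, since that depends on the alphabetization practice chosen for particles such as "van" or "del".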
Alternative names
The optional |ALTERNATIVE NAMES= field is used to list other common (usually international) forms of the person's name, but not simply abbreviated versions of the full name. It follows a similar pattern to the |NAME= field, but with added information, adding as many semicolon-separated names as needed to fully identify the person:
|ALTERNATIVE NAMES=Alternative Name1 (language); Alternative Name2 (language); Alternative Name3 (pen/stage/etc. name)
An alternative name in another language should not be added unless the subject has a particular connection with that language (e.g., the Japanese romaji for Oprah Winfrey is not important metadata for her article on the English-language Wikipedia, while the Italian and Spanish for Christopher Columbus definitely are, since "Christopher Columbus" is simply an English translation of them).
Suggested optional attributes in parentheses (adapted from the German):
Attribute | Description |
---|---|
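pseudonym | An alias, such as a person's pen name or stage name. |
stage name | This name is a stage name. In contrast to a pseudonym this is always publicly known. |
nickname | A nickname, i.e. a name the person did not choose for themselves. |
real name | The person's real name. |
birth name | The name given at birth; some people change their name later in life, or upon marriage. |
full name | The full name; e.g. Kennedy's full name is Kennedy, John Fitzgerald, but the most commonly used form is Kennedy, John F. |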
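A consumer of this field might split it back into names and attributes along the following lines. This is a minimal sketch that assumes the "Name (attribute); Name (attribute)" pattern described above is followed; the function name parse_alternative_names is hypothetical.

```python
import re

# A name, optionally followed by one parenthesized attribute at the end.
_ENTRY_RE = re.compile(
    r"^(?P<name>.*?)\s*(?:\((?P<attr>[^()]*)\))?$", re.DOTALL
)


def parse_alternative_names(value):
    """Split an ALTERNATIVE NAMES value into (name, attribute) pairs.

    The attribute is None when an entry has no trailing parentheses.
    """
    entries = []
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate empty fields and trailing semicolons
        match = _ENTRY_RE.match(part)
        entries.append((match.group("name"), match.group("attr")))
    return entries


if __name__ == "__main__":
    field = ("Magalhães, Fernão de (Portuguese); "
             "Magallanes, Fernando de (Spanish)")
    for name, attribute in parse_alternative_names(field):
        print(name, "|", attribute)
```

Names that themselves contain a semicolon would be split incorrectly; the sketch accepts that limitation because the field uses the semicolon as its separator.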
Short description
A small description of the person. Try to be concise but informative enough so that anyone reading this entry in a table will roughly know who the person is/was or does/did. The first letter of this should be capitalized, but non-proper nouns should not otherwise be capitalized, e.g. |SHORT DESCRIPTION=Rock musician, not |SHORT DESCRIPTION=Rock Musician. If the article name is disambiguated, or a disambiguation redirects to it, the disambiguation term/phrase is usually a good selection for this parameter; e.g., |SHORT DESCRIPTION=Cyclist and |SHORT DESCRIPTION=Billiards player are good bets for Eddy Merckx (cyclist) and Eddy Merckx (billiards player), respectively. An exception is when the disambiguation is a description of a field instead of an occupation/role (e.g. Connie Mack (baseball), whose article details suggest |SHORT DESCRIPTION=Baseball manager and team owner).
Articles which do not have a |SHORT DESCRIPTION= field, or leave the field blank, will be added to Category:Persondata templates without short description parameter. Please follow these guidelines when working on articles in this category.
Dates of birth and death
Follow the Manual of Style guidelines on whether to use D Month YYYY format or the Month D, YYYY style when filling the |DATE OF BIRTH= and |DATE OF DEATH= fields. Do not link the date or numbers. YYYY-MM-DD is also acceptable, provided the Gregorian calendar is used and the year is at least 1583.
Do not use templates within these fields, as they can interfere with data extraction. Abraham Lincoln's birthday, for example, should be listed as February 12, 1809 and not as {{birth date|1809|2|12|mf=y}}.
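To show why plain dates are easier to process than templates, here is a rough sketch of how an extraction script might read the three accepted formats. The function name parse_persondata_date is hypothetical, and the sketch assumes English month names (Python's default C locale) and four-digit years. The returned date object is used only as a day/month/year container; no calendar conversion is attempted.

```python
from datetime import datetime

# "D Month YYYY" and "Month D, YYYY", as allowed by the Manual of Style.
_TEXT_FORMATS = ("%d %B %Y", "%B %d, %Y")


def parse_persondata_date(value):
    """Return a datetime.date for a complete date, or None.

    None is returned for anything the sketch does not recognize, such as
    "Early 1480", a bare year, or a templated date.
    """
    value = value.strip()
    for fmt in _TEXT_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return None
    # YYYY-MM-DD is only acceptable from 1583 onwards.
    return parsed if parsed.year >= 1583 else None


if __name__ == "__main__":
    for text in ("27 April 1521", "February 12, 1809", "1809-02-12",
                 "Early 1480", "{{birth date|1809|2|12|mf=y}}"):
        print(repr(text), "->", parse_persondata_date(text))
```

Vague values such as "Early 1480" are valid persondata but need separate handling, which is why the sketch returns None for them instead of guessing.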
Places of birth and death
Be specific, but not to the point of listing a street address. Usual formats are City/Village, State/Province, Country; or City/Village, country; or State/Province, Country; etc. Again avoid using templates in these fields. Although you can link to locations if you wish, avoid using piped links as they clutter the field.
Examples
Fieldname | Examples |
---|---|
NAME | Magellan, Ferdinand |
ALTERNATIVE NAMES | Magalhães, Fernão de (Portuguese); Magallanes, Fernando de (Spanish) |
SHORT DESCRIPTION | Sea explorer |
DATE OF BIRTH | 1480 |
PLACE OF BIRTH | Sabrosa, Portugal |
DATE OF DEATH | 27 April 1521 |
PLACE OF DEATH | Mactan Island, Cebu, Philippines |
Extracting persondata
With project Templatetiger
With project Templatetiger it is possible to view and output the data with:
From an SQL database
This Wikipedia page needs to be updated. (1 February 2012)
Using an SQL query, the persondata can be filtered from Wikipedia articles stored in a database. As an example, here is an SQL query that can be used to extract persondata:
SELECT pages.cur_namespace,
       pages.cur_title,
       SUBSTRING(
         SUBSTRING(pages.cur_text FROM INSTR(pages.cur_text, '{{Persondata')),
         1,
         INSTR(SUBSTRING(pages.cur_text FROM INSTR(pages.cur_text, '{{Persondata')), '}}') + 1
       ) AS 'Persondata'
FROM cur AS pd
JOIN templatelinks AS tl
  ON pd.cur_namespace = tl.tl_namespace
 AND pd.cur_title = tl.tl_title
JOIN cur AS pages
  ON tl.tl_from = pages.cur_id
 AND pages.cur_namespace = 0
WHERE pd.cur_namespace = 10
  AND pd.cur_title = 'Persondata'
In order to be useful, however, the persondata must be further divided into individual data fields.
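That further division could look roughly like the sketch below, which takes one extracted {{Persondata ...}} block and returns its fields as a dictionary. The function name split_persondata is hypothetical, and the sketch assumes field names are written in capitals as documented on this page; it is not the method used by any existing extraction script.

```python
import re

# A pipe, then an upper-case field name, then "=".
# Piped wikilinks such as [[Foo|bar]] do not match: no "=" follows the name.
_FIELD_RE = re.compile(r"\|\s*([A-Z][A-Z ]*?)\s*=")


def split_persondata(block):
    """Split one {{Persondata ...}} block into a dict of field values."""
    inner = block.strip()
    if inner.startswith("{{"):
        inner = inner[2:]
    if inner.endswith("}}"):
        inner = inner[:-2]
    matches = list(_FIELD_RE.finditer(inner))
    fields = {}
    for index, match in enumerate(matches):
        # A value runs up to the start of the next field, or to the end.
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(inner)
        fields[match.group(1)] = inner[match.end():end].strip()
    return fields


if __name__ == "__main__":
    example = (
        "{{Persondata | NAME = Magellan, Ferdinand "
        "| SHORT DESCRIPTION = Sea explorer "
        "| DATE OF BIRTH = Early 1480 | DATE OF DEATH = 27 April 1521 }}"
    )
    for key, value in split_persondata(example).items():
        print(f"{key}: {value}")
```

Because the regular expression treats whitespace and newlines alike, the same function handles both the one-line and the one-field-per-line layout of the template.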
From the XML dump
Persondata can also be extracted from the regular Wikipedia database dumps. The following procedure has been adapted from scripts written to do this for the German Wikipedia by de:User:JakobVoss (who is also User:Nichtich). This is described (in German) at de:Hilfe:Personendaten/Datenextraktion. The process consists of four stages: downloading the database dump, extracting the persondata, parsing the persondata, and optionally loading it into a MySQL database. (This is an example of an Extract, transform, load process.) As a rough guide, downloading the database dump will take a few hours with a fast internet connection and extracting the persondata will take around an hour, while parsing the persondata and loading it into a MySQL database each take a few seconds.
System requirements
The original scripts were written for Linux; however, they can also be run in Windows using either a Linux emulator, or by downloading Windows versions of the necessary software:
- java (Most Windows machines will already have this installed)
- bzip2 Download here
- perl Download here
In addition if you want to load the extracted persondata into a MySQL database you will need MySQL (Download here).
Downloading the database dump
Database dumps can be found at http://download.wikimedia.org/enwiki. The subdirectories are named after the date of the dump. The file needed for extracting persondata is named enwiki-date-pages-articles.xml.bz2, e.g. enwiki-20070908-pages-articles.xml.bz2. The latest version of this file can always be found at http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2. As of June 2012 this file is 8.0 GB in size. You may find it useful to use wget to download this.
Extracting persondata
Files needed:
- joost.jar (latest version available from http://joost.sourceforge.net/)
- addNamespaces.stx
- extractPersondata.stx
- pd2tab.stx
Bzip2 is used to uncompress the dump, and the output is passed to three piped STX-scripts to extract the information in the persondata templates. STX is implemented in the java archive joost.jar.
The syntax for calling these scripts is
bzip2 -dc enwiki-20070908-pages-articles.xml.bz2 | java -jar joost.jar - addNamespaces.stx extractPersondata.stx pd2tab.stx > 20070908-extract.tab
This can be typed in at the command prompt in Linux or in Windows (Start->Run->cmd). Alternatively, in Windows it can be typed into a text file with the .bat extension (e.g. extract.bat), which can then be run by double-clicking on it. Note that in Windows you will need to type bzip2.exe instead of bzip2, and if bzip2 is not in the same directory as the database dump you will need to specify the full file path (e.g. C:\full\file\path\bzip2.exe).
This process outputs a running total of the number of articles found with persondata. It also outputs a running total of articles with Template:PND. This is a legacy of the original German scripts; it was easier not to remove it when adapting them. (A Personennamendatei number is assigned to all German-speaking authors, and can be used to link to the catalogue of the German National Library. Some 170,000 articles in the German Wikipedia use this template. A few hundred articles currently use it in the English Wikipedia.)
The output of this step is a tab-separated file (20070908-extract.tab in the above example) which contains the information from the persondata template.
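For readers without the STX tool chain, the same extraction idea can be sketched in Python. This is a different technique from the joost.jar pipeline above, not a port of it: it streams the compressed dump with the standard bz2 and xml.etree modules and yields the raw template text per article instead of writing a tab-separated file. The function names are hypothetical, and, like the SQL query earlier on this page, the pattern stops at the first closing braces, so it relies on the fields containing no nested templates.

```python
import bz2
import re
import xml.etree.ElementTree as ET

# First {{Persondata ...}} block in an article; stops at the first "}}".
_PERSONDATA_RE = re.compile(
    r"\{\{\s*Persondata\b.*?\}\}", re.IGNORECASE | re.DOTALL
)


def _local_name(tag):
    """Strip the XML namespace that the dump puts on every tag."""
    return tag.rsplit("}", 1)[-1]


def iter_persondata(stream):
    """Yield (article title, persondata wikitext) from a dump stream."""
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = _local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            match = _PERSONDATA_RE.search(elem.text or "")
            if match:
                yield title, match.group(0)
        elif name == "page":
            elem.clear()  # release the page text once it has been handled


def iter_dump(path):
    """Open a .xml.bz2 dump file and yield its persondata blocks."""
    with bz2.open(path, "rb") as stream:
        yield from iter_persondata(stream)


if __name__ == "__main__":
    import sys
    for page_title, block in iter_dump(sys.argv[1]):
        print(page_title, block.replace("\n", " "), sep="\t")
```

Run against a dump file given on the command line, this prints one tab-separated line per article; the column layout is an arbitrary choice for the sketch and does not match the output of pd2tab.stx.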
Parsing persondata
File needed: transform.pl
The information entered in the fields of the persondata template can take many forms, especially the dates. For many applications it is useful to have such information in standardised form. The Perl script transform.pl takes the XXXX-extract.tab file from the previous step and parses the fields to obtain quantities such as day, month, year, decade, century for the dates, given name and surname for names of the form Smith, John, article name where the first place in the birth/death place field is a wikilink, etc.
The syntax for this step is
transform.pl 20070908-extract.tab > 20070908-full.tab
This produces another tab-separated file. If desired this can be loaded into a spreadsheet and certain basic information obtained, by either sorting the columns or searching for appropriate terms, however more complicated analysis is more easily done using a database.
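The kind of derived quantities mentioned above can be illustrated with a short sketch. It is not a translation of transform.pl: the function names are hypothetical, and the decade and century conventions used here (decade as the year rounded down to a multiple of ten, century counted so that 1501 to 1600 is the 16th) are assumptions, since this page does not state how the Perl script defines them.

```python
def split_name(name):
    """Return (given name, surname) for names of the form "Smith, John".

    Names without a comma (e.g. "Ho Chi Minh") give (None, None), because
    the parts cannot be told apart reliably.
    """
    parts = [part.strip() for part in name.split(",")]
    if len(parts) < 2 or not parts[0] or not parts[1]:
        return None, None
    # Anything after the second comma (such as a title) is ignored.
    return parts[1], parts[0]


def year_parts(year):
    """Return the year together with its decade and century (AD only)."""
    return {
        "year": year,
        "decade": year // 10 * 10,         # 1521 -> 1520
        "century": (year - 1) // 100 + 1,  # 1521 -> 16
    }


if __name__ == "__main__":
    print(split_name("Magellan, Ferdinand"))
    print(year_parts(1521))
```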
Loading persondata into a database
Software needed:
If you have MySQL installed, you can use table.sql to create a table called pub_pd_en and load the data from XXXX-full.tab into it. (You will need to change the filename at the end of table.sql.)
Within MySQL the syntax to run this is
source C:/full/file/path/table.sql;
Linux scripts
In the original implementation on the German Wikipedia the whole process from extracting data to loading it into a database was performed by a single shell script, etl, which in turn called scripts extract.pl, transform.pl and load.pl. If you wish to use these they can be found at http://toolserver.org/~voj/pd/staging-area/. In addition to the modified files listed under the previous steps, some minor modifications to extract.pl and load.pl are necessary to use these for the English case, e.g. replacing de with en, and extractPersonendaten with extractPersondata in extract.pl, and using your own username in load.pl. The modified version of transform.pl given under Parsing persondata above should of course be used as well.
Files:
Tools for filling out persondata
See also
- Template and corresponding example in Semantic Mediawiki (SMW); note that in SMW either the whole data field is a single link (relation), or the data field is not linked at all (attribute).
- hCard – a microformat with similar properties.
- hCard in Wikipedia
- Comparison of PERSONDATA and hCard (no longer maintained – Death-date ("dday") is a new property, in the next draft of the vCard specification, along with places of birth and death)
- WikiProject Persondata
- Biographies of living persons