
PGI compiler TIPS
What Is PGI Unified Binary(TM)?

PGI Unified Binary(TM) Performance on AMD64 and EM64T CPUs

A PGI Unified Binary is a single executable file that contains code sequences optimized for both AMD64 and Intel 64 (EM64T) processors. With PGI Unified Binary, independent software vendors (ISVs) and custom application developers can take advantage of the latest microprocessor technology from both Intel and AMD on either company's x64 platform. (First published January 11, 2006)

Intel's EM64T processors are compatible with AMD's AMD64 processors (Opteron, Turion64, Athlon64) at the CPU instruction level by way of the AMD64 ABI. The microarchitectures differ, however, and so do the appropriate optimization techniques, particularly around memory access. Code that is not optimized for the characteristics of the CPU it runs on can lose performance.

The PGI compilers are commercial compilers that can optimize for the respective microarchitectures of AMD64 and EM64T processors. A commercial compiler needs to optimize performance for each processor, not just provide compatibility, yet many other compilers make no clear claim on this point. To get the full performance of a given processor, it is not enough for an executable to be in AMD64 format: binaries built for EM64T and binaries built for AMD64 must be distinguished, and each used on its own target. This is clearly a heavy burden both for ISV developers and for the users of their software.

To solve this problem, PGI version 6.1, released in January 2006, introduced "Unified Binary support" between AMD64 and EM64T, an industry first. PGI Unified Binary(TM) is a feature that optimizes for both platforms with no performance penalty on either and generates a single executable. Because program performance carries over transparently to both platforms, the cost of optimization work during development is reduced.

This article uses the PGI 6.1 compilers to demonstrate that binaries built for AMD64 and for EM64T, whose microarchitectures are fundamentally different, perform differently. It then shows the performance of a PGI Unified Binary, which delivers optimal performance on both processors.

An article on AMD64/EM64T performance with PGI 6.0 and earlier compilers is available separately.


  Reference: The speed of the PGI compilers is also demonstrated on an Intel(R) dual Pentium(R) D processor system
             Evaluating an AMD dual Athlon64 X2 processor system with the PGI compilers


Verification Using the PGI 6.1 Cross-Compilation Feature


By default, the compiler optimizes for the CPU type of the system on which the compilation is performed. For example, a module compiled on an Opteron (AMD64) is optimized for AMD64 CPUs. Running that module on an AMD64-compatible Intel EM64T CPU (Pentium, Xeon) and measuring the effect on performance shows whether CPU-dependent optimization has actually been applied. The same holds in the reverse direction.
The examples below illustrate the performance difference in such cases and the performance of a Unified Binary. They show that the PGI compilers optimize for each CPU. The compilation methods are shown first.

[ Default: no -tp option ]

   pgf95 -fastsse -Minfo xx.f    (the CPU of the compiling system is the default optimization target)

[ Cross-compiling: optimizing for a different CPU ]

   pgf95 -fastsse -Minfo -tp p7-64 xx.f    (the -tp option generates code optimized for Intel EM64T)

[ Building an executable in PGI Unified Binary mode ]

   pgf95 -fastsse -Minfo -tp x64 xx.f    (-tp x64 must be specified explicitly)

  • Compiling with -tp p7-64 on an AMD64 (Opteron) system generates code optimized for EM64T. Conversely, compiling with -tp k8-64 on a Pentium(R) 4/Xeon(R) EM64T system generates code for AMD64.
  • To build a Unified Binary, you must compile with -tp x64 specified explicitly.
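
The compile modes described above can be summarized in a small dry-run helper. This is only a sketch: the function name pgf95_cmd and the target labels (native, em64t, amd64, unified) are made up for illustration, and only the pgf95 flags themselves are taken from this article. It prints the command line rather than invoking the compiler.

```shell
#!/bin/sh
# Hypothetical helper (not part of PGI): print the pgf95 command line for a
# given optimization target, using only the flags quoted in this article.
#   native  -> no -tp option (optimize for the build host CPU)
#   em64t   -> -tp p7-64
#   amd64   -> -tp k8-64
#   unified -> -tp x64 (PGI Unified Binary)
pgf95_cmd() {
    target=$1
    src=$2
    case "$target" in
        native)  tp="" ;;
        em64t)   tp="-tp p7-64" ;;
        amd64)   tp="-tp k8-64" ;;
        unified) tp="-tp x64" ;;
        *) echo "unknown target: $target" >&2; return 1 ;;
    esac
    # ${tp:+ $tp} expands to " $tp" only when tp is non-empty
    echo "pgf95 -fastsse -Minfo${tp:+ $tp} $src"
}

# Print the command line for each mode (dry run only)
for t in native em64t amd64 unified; do
    pgf95_cmd "$t" xx.f
done
```

Keeping the target selection in one place makes it harder to forget the explicit -tp x64 that a Unified Binary build requires.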

On each of the two machines below, we build executables for a CPU type different from that of the machine and compare the execution times. The test program is the Himeno benchmark, which demands the full memory bandwidth, and we check how much the performance differs.

① AMD64 machine: Athlon64 X2 (2.2GHz) + slower PC3200 memory
② EM64T machine: Pentium D (2.8GHz) + faster DDR2-667 (PC2-5300) memory

 Compiler: PGI 6.1
 OS: SUSE 10.0 (kernel 2.6.13) on both machines

① Verification on the AMD64 machine

  Running EM64T-optimized code on the AMD64 machine, for which it was not tuned, degrades performance somewhat.

[ Generate and run AMD64-specific code ]

amd64 $ pgf95 -fastsse -O3 -Mprefetch=distance:8,nta -Minfo himenoBMTxp_s.f
amd64 $ ./a.out
  mimax=          129  mjmax=           65  mkmax=           65
  imax=          128  jmax=           64  kmax=           64
  Loop executed for          4500  times
  Gosa :   2.3161024E-06
  MFLOPS:    1318.823       time(s):    56.19000

[ Generate and run EM64T-specific code ]

amd64 $ pgf95 -fastsse -O3 -Mprefetch=distance:8,nta -Minfo -tp p7-64 himenoBMTxp_s.f
amd64 $ ./a.out
  mimax=          129  mjmax=           65  mkmax=           65
  imax=          128  jmax=           64  kmax=           64
  Loop executed for          4500  times
  Gosa :   2.3161024E-06
  MFLOPS:    1256.863       time(s):    58.96000

  The PGI Unified Binary code delivers nearly the same performance as the AMD64-specific code above.

[ Generate and run Unified Binary code ]

amd64 $ pgf95 -fastsse -O3 -Mprefetch=distance:8,nta -Minfo -tp x64 himenoBMTxp_s.f
amd64 $ ./a.out
  mimax=          129  mjmax=           65  mkmax=           65
  imax=          128  jmax=           64  kmax=           64
  Loop executed for          3600  times
  Gosa :   8.8548159E-06
  MFLOPS:    1325.368       time(s):    44.73000

② Verification on the EM64T machine

  Running AMD64-optimized code on the EM64T machine, for which it was not tuned, degrades performance.

[ Generate and run EM64T-specific code ]

em64t $ pgf95 -fastsse -O3 -Mprefetch=distance:8,nta -Minfo himenoBMTxp_s.f
em64t $ ./a.out
  mimax=          129  mjmax=           65  mkmax=           65
  imax=          128  jmax=           64  kmax=           64
  Loop executed for          4500  times
  Gosa :   2.3161024E-06
  MFLOPS:    1562.730       time(s):    47.42000

[ Generate and run AMD64-specific code ]

em64t $ pgf95 -fastsse -O3 -Mprefetch=distance:8,nta -Minfo -tp k8-64 himenoBMTxp_s.f
em64t $ ./a.out
  mimax=          129  mjmax=           65  mkmax=           65
  imax=          128  jmax=           64  kmax=           64
  Loop executed for          6000  times
  Gosa :   2.4838511E-07
  MFLOPS:    1359.095       time(s):    72.70000

  The PGI Unified Binary code delivers nearly the same performance as the EM64T-specific code above.

[ Generate and run Unified Binary code ]

em64t $ pgf95 -fastsse -O3 -Mprefetch=distance:8,nta -Minfo -tp x64 himenoBMTxp_s.f
em64t $ ./a.out
  mimax=          129  mjmax=           65  mkmax=           65
  imax=          128  jmax=           64  kmax=           64
  Loop executed for          4500  times
  Gosa :   2.3161024E-06
  MFLOPS:    1578.373       time(s):    46.95000
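
The runs above did not all execute the same number of loops (3600, 4500, and 6000), so the elapsed times are not directly comparable; the MFLOPS figures are. As a rough way to quantify the differences, the sketch below computes the percentage change in MFLOPS relative to the natively optimized binary on each machine. It assumes awk is available, and rel_pct is a hypothetical helper name, not something provided by PGI or by the benchmark.

```shell
#!/bin/sh
# rel_pct BASE OTHER: percentage difference of OTHER relative to BASE,
# printed with an explicit sign and one decimal place.
rel_pct() {
    awk -v base="$1" -v other="$2" \
        'BEGIN { printf "%+.1f\n", (other / base - 1) * 100 }'
}

# MFLOPS values copied from the benchmark output in this article
echo "AMD64 machine, EM64T code vs. AMD64 code:   $(rel_pct 1318.823 1256.863)%"
echo "AMD64 machine, Unified Binary vs. AMD64:    $(rel_pct 1318.823 1325.368)%"
echo "EM64T machine, AMD64 code vs. EM64T code:   $(rel_pct 1562.730 1359.095)%"
echo "EM64T machine, Unified Binary vs. EM64T:    $(rel_pct 1562.730 1578.373)%"
```

With the figures reported here, the mismatched binaries come out about 4.7% slower on the AMD64 machine and about 13.0% slower on the EM64T machine, while the Unified Binary is within about 1% of the native build on both.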

In addition to optimizing for the processor characteristics of AMD64 and EM64T, the PGI compilers can optimize for AMD's NUMA architecture as well as for Intel's conventional UMA architecture.



SofTek is an authorized reseller of PGI products.

Copyright 2006 SofTek Systems Inc. All Rights Reserved.