|
Home | Switchboard | Unix Administration | Red Hat | TCP/IP Networks | Neoliberalism | Toxic Managers |
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and bastardization of classic Unix |
The InfiniBand subnet manager (OpenSM) assigns Local IDentifiers (LIDs) to each port connected to the InfiniBand fabric, and develops a routing table based off of the assigned LIDs.
There are two types of subnet managers:
A typical InfiniBand installation using the OFED package will run the OpenSM subnet manager at system start up after the OpenIB drivers are loaded. This automatic OpenSM is resident in memory, and sweeps the InfiniBand fabric approximately every 5 seconds for new InfiniBand adapters to add to the subnet routing tables. This usage will be sufficient for most installations, and can be controlled using the following commands:
/etc/init.d/opensmd start /etc/init.d/opensmd stop /etc/init.d/opensmd restart /etc/init.d/opensmd statusThere are several instances where the default usage will not be sufficient, however. If the head node is used as a compute node, and resources are at a premium, the OpenSM subnet manager can be set to run once, configure the LIDs and routing tables, and then exit:
opensm -oFor InfiniBand adapters with two ports, a second instance of the subnet manager must be active to enable a subnet on the second port. To begin, enable the subnet manager as above:
service opensmd start
service opensmd start Starting IB Subnet Manager. [ OK ] # ibstat CA 'mlx4_0' CA type: MT4099 Number of ports: 1 Firmware version: 2.30.8000 Hardware version: 1 Node GUID: 0x0002c90300b78240 System image GUID: 0x0002c90300b78243 Port 1: State: Active Physical state: LinkUp Rate: 40 (FDR10) Base lid: 1 LMC: 0 SM lid: 1 Capability mask: 0x0251486a Port GUID: 0x0002c90300b78241 Link layer: InfiniBand
If we have two ports we need to discover the GUID of the second port:
ibstat -pThis command will output two numbers, one for each port. Use the second number to start up a new OpenSM instance in daemon mode:
opensm -g <0xguid number> -BThere may also be an instance where the head node does not have InfiniBand hardware, but the compute nodes do. In this case, provided a hardware subnet manager is not used, one of the compute nodes must act as the subnet manager.
If there is already a subnet manager is running on the cluster, either
a hardware based version or an OpenSM instance, then running OpenSM on another node will cause the new
instance to be put in a STANDBY state. In this state, the instance listens for the existing OpenSM instance
to fail, and will take over subnet manager duties once a failure state has been detected.
|
Switchboard | ||||
Latest | |||||
Past week | |||||
Past month |
Once therdma
service is enabled, and the opensm service (if needed) is enabled, and the proper user-space library for the specific hardware has been installed, user spacerdma
operation should be possible. Simple test programs from the libibverbs-utils package are helpful in determining that RDMA operations are working properly. The ibv_devices program will show which devices are present in the system and theibv_devinfo
command will give detailed information about each device. For example:~]$Theibv_devices
device node GUID ------ ---------------- mlx4_0 0002c903003178f0 mlx4_1 f4521403007bcba0 ~]$ibv_devinfo -d mlx4_1
hca_id: mlx4_1 transport: InfiniBand (0) fw_ver: 2.30.8000 node_guid: f452:1403:007b:cba0 sys_image_guid: f452:1403:007b:cba3 vendor_id: 0x02c9 vendor_part_id: 4099 hw_ver: 0x0 board_id: MT_1090120019 phys_port_cnt: 2 port: 1 state: PORT_ACTIVE (4) max_mtu: 4096 (5) active_mtu: 2048 (4) sm_lid: 2 port_lid: 2 port_lmc: 0x01 link_layer: InfiniBand port: 2 state: PORT_ACTIVE (4) max_mtu: 4096 (5) active_mtu: 4096 (5) sm_lid: 0 port_lid: 0 port_lmc: 0x00 link_layer: Ethernet~]$ ibstat mlx4_1
CA 'mlx4_1' CA type: MT4099 Number of ports: 2 Firmware version: 2.30.8000 Hardware version: 0 Node GUID: 0xf4521403007bcba0 System image GUID: 0xf4521403007bcba3 Port 1: State: Active Physical state: LinkUp Rate: 56 Base lid: 2 LMC: 1 SM lid: 2 Capability mask: 0x0251486a Port GUID: 0xf4521403007bcba1 Link layer: InfiniBand Port 2: State: Active Physical state: LinkUp Rate: 40 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x04010000 Port GUID: 0xf65214fffe7bcba2 Link layer: Ethernetibv_devinfo
andibstat
commands output slightly different information (such as port MTU exists inibv_devinfo
but not inibstat
output, and the Port GUID exists inibstat
output but not inibv_devinfo
output), and a few things are named differently (for example, the Base local identifier (LID) inibstat
output is the same as theport_lid
output ofibv_devinfo
) Simple ping programs, such as ibping from the infiniband-diags package, can be used to test RDMA connectivity. The ibping program uses a client-server model. You must first start an ibping server on one machine, then run ibping as a client on another machine and tell it to connect to the ibping server. Since we are wanting to test the base RDMA capability, we need to use an RDMA specific address resolution method instead ofIP
addresses for specifying the server. On the server machine, the user can use theibv_devinfo
andibstat
commands to print out theport_lid
(or Base lid) and the Port GUID of the port they want to test (assuming port 1 of the above interface, theport_lid
/Base LID
is2
and Port GUID is0xf4521403007bcba1
)). Then start ibping with the necessary options to bind specifically to the card and port to be tested, and also specifying ibping should run in server mode. You can see the available options to ibping by passing-?
or--help
, but in this instance we will need either the-S
or--Server
option and for binding to the specific card and port we will need either-C
or--Ca
and-P
or--Port
. Note: port in this instance does not denote a network port number, but denotes the physical port number on the card when using a multi-port card. To test connectivity to the RDMA fabric using, for example, the second port of a multi-port card, requires telling ibping to bind to port2
on the card. When using a single port card, or testing the first port on a card, this option is not needed. For example:~]$Then change to the client machine and run ibping. Make note of either the port GUID of the port the server ibping program is bound to, or the local identifier (LID) of the port the server ibping program is bound to. Also, take note which card and port in the client machine is physically connected to the same network as the card and port that was bound to on the server. For example, if the second port of the first card on the server was bound to, and that port is connected to a secondary RDMA fabric, then on the client specify whichever card and port are necessary to also be connected to that secondary fabric. Once these things are known, run the ibping program as a client and connect to the server using either the port LID or GUID that was collected on the server as the address to connect to. For example:ibping -S -C mlx4_1 -P 1
~]$oribping -c 10000 -f -C mlx4_0 -P 1 -L 2
--- rdma-host.example.com.(none) (Lid 2) ibping statistics --- 10000 packets transmitted, 10000 received, 0% packet loss, time 816 ms rtt min/avg/max = 0.032/0.081/0.446 ms~]$This outcome verifies that end to end RDMA communications are working for user space applications. The following error may be encountered:ibping -c 10000 -f -C mlx4_0 -P 1 -G 0xf4521403007bcba1 \
--- rdma-host.example.com.(none) (Lid 2) ibping statistics --- 10000 packets transmitted, 10000 received, 0% packet loss, time 769 ms rtt min/avg/max = 0.027/0.076/0.278 ms~]$This error indicates that the necessary user-space library is not installed. The administrator will need to install one of the user-space libraries (as appropriate for their hardware) listed in section Section 13.4, "InfiniBand and RDMA related software packages". On rare occasions, this can happen if a user installs the wrong arch type for the driver or for libibverbs. For example, if libibverbs is of archibv_devinfo
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0 No IB devices foundx86_64
, and libmlx4 is installed but is of typei686
, then this error can result.Note
Many sample applications prefer to use host names or addresses instead of LIDs to open communication between the server and client. For those applications, it is necessary to set up IPoIB before attempting to test end-to-end RDMA communications. The ibping application is unusual in that it will accept simple LIDs as a form of addressing, and this allows it to be a simple test that eliminates possible problems with IPoIB addressing from the test scenario and therefore gives us a more isolated view of whether or not simple RDMA communications are working.
Google matched content |
13.7. Testing Early InfiniBand RDMA operation Red Hat Enterprise Linux 7 - Red Hat Customer Portal
Society
Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers : Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism : The Iron Law of Oligarchy : Libertarian Philosophy
Quotes
War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda : SE quotes : Language Design and Programming Quotes : Random IT-related quotes : Somerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose Bierce : Bernard Shaw : Mark Twain Quotes
Bulletin:
Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 : Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law
History:
Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds : Larry Wall : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOS : Programming Languages History : PL/1 : Simula 67 : C : History of GCC development : Scripting Languages : Perl history : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history
Classic books:
The Peter Principle : Parkinson Law : 1984 : The Mythical Man-Month : How to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite
Most popular humor pages:
Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor
The Last but not Least Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~Archibald Putt. Ph.D
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...
|
You can use PayPal to to buy a cup of coffee for authors of this site |
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. You you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.
Last modified: