There is not a "formula" to get instructions retired for each 'core', but it can be measured using the hardware performance counters. (There are known bugs in this performance counter event on Haswell processors. It is usually correct, but I have some cases that are systematically off by as much as 20%.)
"Instructions Retired" is available on each logical processor using fixed-function performance counter 0 or using a programmable performance counter. Fixed-function performance counter 0 is accessible by reading MSR 0x309 (kernel only), or by executing the RDPMC instruction with the ECX register set to 0x40000000. RDPMC returns the count in EDX:EAX.
Programmable counters are also accessible by either RDMSR instructions (kernel only) or RDPMC instructions. Program any of the performance counter event select registers (MSRs 0x186-0x189 on most Intel products) to 0x004300c0 (event 0xC0, unit mask 0x00, counting in both user and kernel mode, counter enabled). The counts are then returned by the corresponding MSR in the 0xc1-0xc4 range, or by executing the RDPMC instruction with the ECX register set to the counter number (0, 1, 2, or 3, corresponding to whichever of the performance counter event select registers you decided to use).
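To make the encodings above concrete, here is a minimal C sketch, not a definitive implementation. The helper names (evtsel_encode, rdpmc_fixed_selector, rdpmc) are hypothetical and chosen only for illustration. The inline assembly assumes a GCC-compatible compiler on x86, and executing RDPMC from user space faults unless the operating system has enabled user-mode RDPMC (CR4.PCE set), so treat the last function as something to call only when you know that is the case.

```c
#include <stdint.h>

/* Bit positions in a performance counter event select register. */
#define EVTSEL_USR (1u << 16)  /* count while running in user mode */
#define EVTSEL_OS  (1u << 17)  /* count while running in kernel mode */
#define EVTSEL_EN  (1u << 22)  /* enable the counter */

/* Hypothetical helper: build an event select value that counts the given
 * event/unit mask in both user and kernel mode with the counter enabled. */
static inline uint32_t evtsel_encode(uint8_t event, uint8_t umask)
{
    return (uint32_t)event | ((uint32_t)umask << 8)
           | EVTSEL_USR | EVTSEL_OS | EVTSEL_EN;
}

/* Hypothetical helper: RDPMC selector for fixed-function counter n.
 * Bit 30 selects the fixed-function counter bank. */
static inline uint32_t rdpmc_fixed_selector(uint32_t n)
{
    return (1u << 30) | n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* RDPMC takes the counter selector in ECX and returns the count in EDX:EAX.
 * Pass 0..3 for a programmable counter, or rdpmc_fixed_selector(0) for
 * fixed-function counter 0 (Instructions Retired). */
static inline uint64_t rdpmc(uint32_t selector)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(selector));
    return ((uint64_t)hi << 32) | lo;
}
#endif
```

With these helpers, evtsel_encode(0xc0, 0x00) reproduces the 0x004300c0 value quoted above, and rdpmc_fixed_selector(0) gives 0x40000000. Reading the counter before and after a region of code and subtracting the two values gives the instructions retired on that logical processor in between, provided the thread is not migrated to another logical processor in the meantime.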
At the whole-program level, Linux systems can count instructions retired using a simple "perf stat a.out" command, but this will aggregate the counts over all of the Logical Processors that participate in running "a.out" (which could be all of them if "a.out" is a threaded code). Root permissions are required to get counts for all processes running in the system (adding "-a" to the "perf stat" command), and in this mode the aggregation of the counts can be inhibited by adding the "-A" option to the "perf stat" command. So it is still very easy: "perf stat -a -A a.out" reports instructions retired separately for each logical processor in the system while "a.out" is running.