2457 lines
47 KiB
Text
0:00:00.000,0:00:02.740
|
|
My name is Attilio Rao and
|
|
|
|
0:00:02.740,0:00:05.960
|
|
I think that we are in time for the presentation
|
|
|
|
0:00:05.960,0:00:10.870
|
|
I want to apologize for my English, because it's not really British English, but I will
|
|
|
|
0:00:10.870,0:00:12.480
|
|
try to make this
|
|
|
|
|
|
0:00:12.480,0:00:16.359
|
|
a little bit uncomfortable
|
|
|
|
0:00:16.359,0:00:21.300
|
|
Better?
|
|
|
|
0:00:21.300,0:00:24.609
|
|
OK. Thank you. So we are going to speak about
|
|
|
|
|
|
0:00:24.609,0:00:28.639
|
|
the locking infrastructure in the FreeBSD kernel
|
|
which
|
|
|
|
0:00:28.639,0:00:33.440
|
|
is quite an interesting topic because
|
|
|
|
0:00:33.440,0:00:38.890
|
|
it's going to be, with time, very widely discussed on our mailing lists, not only
|
|
|
|
0:00:38.890,0:00:43.100
|
|
from a developer's perspective but also from a user's perspective,
|
|
|
|
0:00:43.100,0:00:49.470
|
|
and we will see why later
|
|
|
|
0:00:49.470,0:00:52.990
|
|
In this presentation we will specifically see what
|
|
|
|
0:00:52.990,0:00:55.100
|
|
was the situation
|
|
|
|
0:00:55.100,0:00:57.010
|
|
of the first
|
|
|
|
0:00:57.010,0:00:59.150
|
|
FreeBSD implementations
|
|
|
|
0:00:59.150,0:01:01.120
|
|
and what changed from that
|
|
|
|
0:01:01.120,0:01:06.690
|
|
specifically, what is called the SMPng era
|
|
|
|
0:01:06.690,0:01:07.639
|
|
and what
|
|
|
|
0:01:07.639,0:01:10.500
|
|
we had prior to that
|
|
|
|
0:01:10.500,0:01:12.780
|
|
we are going to discuss
|
|
|
|
0:01:12.780,0:01:13.579
|
|
specifically
|
|
|
|
0:01:13.579,0:01:19.160
|
|
the locking primitives that have been introduced over time until now
|
|
|
|
0:01:19.160,0:01:20.910
|
|
and
|
|
|
|
0:01:20.910,0:01:24.730
|
|
problems linked to
|
|
|
|
0:01:24.730,0:01:27.620
|
|
parallelism in general and how we solve them in
|
|
|
|
0:01:27.620,0:01:30.950
|
|
the FreeBSD kernel
|
|
|
|
0:01:30.950,0:01:36.200
|
|
You can see a table of contents, a little bit more detailed,
|
|
|
|
0:01:36.200,0:01:39.850
|
|
listing precisely what we
|
|
|
|
0:01:39.850,0:01:43.210
|
|
some problems like
|
|
|
|
0:01:43.210,0:01:46.159
|
|
Priority Inheritance
|
|
|
|
0:01:46.159,0:01:53.159
|
|
and Adaptive Spinning that we are going to discuss fruitfully.
|
|
|
|
0:01:53.370,0:01:58.890
|
|
Mostly until FreeBSD 4.x
|
|
|
|
0:01:58.890,0:02:00.830
|
|
We had already moved to multitasking.
|
|
|
|
0:02:00.830,0:02:05.210
|
|
so the slide is a little bit confusing but
|
|
multitasking and preemptive system
|
|
|
|
0:02:05.210,0:02:06.360
|
|
since
|
|
|
|
0:02:06.360,0:02:10.379
|
|
that transition was not very
|
|
|
|
0:02:10.379,0:02:14.180
|
|
was not very difficult to implement in such systems
|
|
because
|
|
|
|
0:02:14.180,0:02:17.479
|
|
if you consider a uniprocessor machine
|
|
|
|
0:02:17.479,0:02:18.929
|
|
you can get that
|
|
|
|
0:02:18.929,0:02:20.019
|
|
well
|
|
|
|
0:02:20.019,0:02:24.029
|
|
the sequential execution was
|
|
just
|
|
|
|
0:02:24.029,0:02:25.699
|
|
stopped by
|
|
|
|
0:02:25.699,0:02:26.309
|
|
preemption
|
|
|
|
0:02:26.309,0:02:29.400
|
|
and by arrival of interrupts
|
|
|
|
0:02:29.400,0:02:33.969
|
|
so you should just maintain the consistency of data structures
|
|
|
|
0:02:33.969,0:02:36.289
|
|
about these two issues
|
|
|
|
0:02:36.289,0:02:37.079
|
|
more precisely
|
|
|
|
0:02:37.079,0:02:39.079
|
|
we were handling
|
|
|
|
0:02:39.079,0:02:41.779
|
|
the interrupts and transitions through
|
|
|
|
0:02:41.779,0:02:43.370
|
|
a mechanism
|
|
|
|
0:02:43.370,0:02:45.779
|
|
called SPL
|
|
|
|
0:02:45.779,0:02:50.769
|
|
and for kernel threads, threads running in the kernel we were disabling
|
|
|
|
0:02:50.769,0:02:51.379
|
|
preemption
|
|
|
|
0:02:51.379,0:02:53.019
|
|
in order to avoid
|
|
|
|
0:02:53.019,0:02:55.809
|
|
the corruption of the data structure
|
|
|
|
0:02:55.809,0:02:57.519
|
|
This approach, while it was
|
|
|
|
0:02:57.519,0:03:00.629
|
|
pretty good on uniprocessor machines
|
|
|
|
0:03:00.629,0:03:02.269
|
|
was actually
|
|
|
|
0:03:02.269,0:03:04.270
|
|
impracticable for
|
|
|
|
0:03:04.270,0:03:06.219
|
|
the SMP environments
|
|
|
|
0:03:06.219,0:03:10.199
|
|
more precisely because we had more cores that
|
|
|
|
0:03:10.199,0:03:12.959
|
|
were each running one thread at a time
|
|
|
|
0:03:12.959,0:03:13.909
|
|
and so
|
|
|
|
0:03:13.909,0:03:14.980
|
|
parallel
|
|
|
|
0:03:14.980,0:03:19.309
|
|
accesses to the data structures were possible
|
|
|
|
0:03:19.309,0:03:21.290
|
|
in order to
|
|
|
|
0:03:21.290,0:03:22.469
|
|
to avoid
|
|
|
|
0:03:22.469,0:03:24.149
|
|
big problems in the kernel
|
|
|
|
0:03:24.149,0:03:25.799
|
|
we had to just
|
|
|
|
0:03:25.799,0:03:26.739
|
|
allow
|
|
|
|
0:03:26.739,0:03:28.989
|
|
the entering of
|
|
|
|
0:03:28.989,0:03:32.309
|
|
one thread at a time into the kernel
|
|
|
|
0:03:32.309,0:03:35.379
|
|
while that was a pretty good approach
|
|
|
|
0:03:35.379,0:03:39.049
|
|
for workloads that were mainly in user space,
|
|
|
|
0:03:39.049,0:03:40.969
|
|
for workloads
|
|
|
|
0:03:40.969,0:03:45.619
|
|
requiring a lot of I/O, for example, it was wasteful because they weren't
|
|
|
|
0:03:45.619,0:03:47.839
|
|
getting any advantage from the new
|
|
|
|
0:03:47.839,0:03:49.819
|
|
SMP architecture
|
|
|
|
0:03:49.819,0:03:52.749
|
|
like the parallelism was basically zero
|
|
|
|
0:03:52.749,0:03:55.189
|
|
at least in the kernel
|
|
|
|
0:03:55.189,0:03:55.949
|
|
in order
|
|
|
|
0:03:55.949,0:04:00.650
|
|
to fix that a new project was created
|
|
called SMP
|
|
|
|
0:04:00.650,0:04:01.470
|
|
New generation
|
|
|
|
|
|
0:04:01.470,0:04:05.169
|
|
or NG
|
|
|
|
0:04:05.169,0:04:07.309
|
|
as you can see it from the slide
|
|
|
|
0:04:07.309,0:04:10.329
|
|
the entering in the kernel was protected
|
|
|
|
0:04:10.329,0:04:12.569
|
|
by using a big lock
|
|
|
|
0:04:12.569,0:04:19.569
|
|
called BKL basically
|
|
|
|
0:04:23.199,0:04:28.109
|
|
With FreeBSD 5.x we had the SMP new generation project
|
|
|
|
0:04:28.109,0:04:30.110
|
|
basically it was
|
|
|
|
0:04:30.110,0:04:31.509
|
|
a sanitization of the
|
|
|
|
0:04:31.509,0:04:34.539
|
|
of all
|
|
|
|
0:04:34.539,0:04:40.039
|
|
our kernel and a re-engineering of a lot of mechanisms inside our kernel. We could see
|
|
|
|
0:04:40.039,0:04:44.709
|
|
FreeBSD 4.x and FreeBSD 5.x as mainly two different kernels
|
|
|
|
0:04:44.709,0:04:45.550
|
|
because of
|
|
|
|
0:04:45.550,0:04:50.150
|
|
substantial subsystems were rewritten and
|
|
|
|
0:04:50.150,0:04:51.830
|
|
were written with the
|
|
|
|
0:04:51.830,0:04:56.949
|
|
idea of using and implementing real parallelism in mind.
|
|
|
|
0:04:56.949,0:05:02.610
|
|
we can say that basically it was a major task, a very big task,
|
|
|
|
0:05:02.610,0:05:04.029
|
|
and that it required
|
|
|
|
0:05:04.029,0:05:06.669
|
|
a lot of years to be brought
|
|
|
|
0:05:06.669,0:05:08.900
|
|
in a good shape at least
|
|
|
|
0:05:08.900,0:05:11.379
|
|
Initially, people made
|
|
|
|
0:05:11.379,0:05:13.069
|
|
a lot of
|
|
|
|
0:05:13.069,0:05:16.350
|
|
complaints about the
|
|
|
|
0:05:16.350,0:05:20.430
|
|
lack of robustness of FreeBSD 5.x, but
|
|
|
|
0:05:20.430,0:05:22.249
|
|
probably that's because they couldn't even
|
|
|
|
0:05:22.249,0:05:28.929
|
|
see that the changes were really really important and really huge
|
|
|
|
0:05:28.929,0:05:34.490
|
|
however, FreeBSD 5.x based this initial SMP system
|
|
|
|
0:05:34.490,0:05:37.070
|
|
on code inherited from BSD/OS,
|
|
|
|
0:05:37.070,0:05:39.309
|
|
that kindly
|
|
|
|
0:05:39.309,0:05:42.699
|
|
released this code, and on top of that
|
|
|
|
0:05:42.699,0:05:44.009
|
|
and the
|
|
|
|
|
|
0:05:44.009,0:05:46.579
|
|
the process was broken up into
|
|
|
|
0:05:46.579,0:05:51.069
|
|
some precise tasks, at least initially.
|
|
|
|
0:05:51.069,0:05:55.429
|
|
Mainly, the first thing was introducing into the kernel
|
|
|
|
0:05:55.429,0:06:00.180
|
|
a new set of atomic instructions and locking primitives.
|
|
|
|
0:06:00.180,0:06:01.520
|
|
Then introducing
|
|
|
|
0:06:01.520,0:06:05.380
|
|
an abstraction called interrupt threads that we are going to discuss
|
|
|
|
0:06:05.380,0:06:06.929
|
|
rather later but
|
|
|
|
0:06:06.929,0:06:12.319
|
|
it basically completely reworked the interrupt mechanism that was in FreeBSD 4.x
|
|
|
|
0:06:14.439,0:06:16.490
|
|
the BKL
|
|
|
|
0:06:16.490,0:06:19.210
|
|
lock was moved to a real
|
|
|
|
0:06:19.210,0:06:20.679
|
|
mutex called Giant
|
|
|
|
0:06:20.679,0:06:23.180
|
|
that still exists in our kernel
|
|
|
|
0:06:23.180,0:06:26.660
|
|
and some threading primitives were introduced
|
|
|
|
0:06:26.660,0:06:28.019
|
|
the
|
|
|
|
0:06:28.019,0:06:30.499
|
|
like M:N
|
|
|
|
0:06:30.499,0:06:32.280
|
|
threading primitives
|
|
|
|
0:06:32.280,0:06:34.319
|
|
also called KSE
|
|
|
|
0:06:34.319,0:06:37.009
|
|
which are actually never used in our kernel
|
|
|
|
0:06:37.009,0:06:41.620
|
|
and that have been taken out in the past year
|
|
|
|
0:06:41.620,0:06:43.409
|
|
and the
|
|
|
|
0:06:43.409,0:06:45.259
|
|
slowly, the porting of
|
|
|
|
0:06:45.259,0:06:50.459
|
|
all the older subsystems to finer-grained locking was started
|
|
|
|
0:06:50.459,0:06:55.919
|
|
I have to say this task is still not completed; it's still going on, but
|
|
|
|
0:06:55.919,0:06:58.889
|
|
we are in really good shape about that
|
|
|
|
0:06:58.889,0:07:02.429
|
|
just a few subsystems remain which are still Giant-protected
|
|
|
|
0:07:02.429,0:07:05.939
|
|
and with the new release that we're going to ship this year, I think that we made
|
|
|
|
0:07:05.939,0:07:10.220
|
|
a very huge step forward in this direction
|
|
|
|
0:07:12.319,0:07:18.599
|
|
SMPng has really been considered closed since around the end of
|
|
|
|
0:07:18.599,0:07:20.600
|
|
2007
|
|
|
|
0:07:20.600,0:07:22.579
|
|
but the
|
|
|
|
0:07:22.579,0:07:23.819
|
|
the
|
|
|
|
0:07:23.819,0:07:27.539
|
|
the important parts were these initial moves
|
|
|
|
0:07:27.539,0:07:32.669
|
|
Another thing that's not listed here, but that I can tell you, is that
|
|
|
|
0:07:32.669,0:07:38.279
|
|
even if Giant was preventing any initial parallelism,
|
|
|
|
0:07:38.279,0:07:43.219
|
|
a new kernel memory allocator was imported
|
|
|
|
0:07:43.219,0:07:45.009
|
|
that I discovered
|
|
|
|
0:07:45.009,0:07:48.439
|
|
and the scheduler was moved to a separate lock
|
|
|
|
0:07:48.439,0:07:50.449
|
|
in order to
|
|
|
|
0:07:50.449,0:07:52.080
|
|
start getting some
|
|
|
|
0:07:52.080,0:07:54.699
|
|
a little bit of concurrency
|
|
|
|
0:07:54.699,0:07:59.099
|
|
a real concurrency
|
|
|
|
0:07:59.099,0:08:01.520
|
|
the
|
|
|
|
0:08:01.520,0:08:06.280
|
|
before speaking about FreeBSD specifics, we can start digging into
|
|
|
|
0:08:06.280,0:08:08.219
|
|
what kind of
|
|
|
|
0:08:08.219,0:08:12.729
|
|
locking primitives you can find in our kernel.
|
|
|
|
0:08:12.729,0:08:15.780
|
|
from a more historical point of view
|
|
|
|
0:08:15.780,0:08:19.710
|
|
we have some versions of mutex which
|
|
|
|
0:08:19.710,0:08:20.919
|
|
I assume
|
|
|
|
0:08:20.919,0:08:24.809
|
|
people here know about that, but I'm going to give a little explanation
|
|
|
|
0:08:24.809,0:08:26.449
|
|
for people that don't know.
|
|
|
|
0:08:26.449,0:08:28.939
|
|
a mutex is basically
|
|
|
|
0:08:28.939,0:08:30.739
|
|
a lock allowing access
|
|
|
|
0:08:30.739,0:08:36.700
|
|
to some protected data to just one thread at a time,
|
|
|
|
0:08:36.700,0:08:38.150
|
|
so if a thread
|
|
|
|
0:08:38.150,0:08:39.690
|
|
owns the lock,
|
|
|
|
0:08:39.690,0:08:40.760
|
|
owns the mutex
|
|
|
|
0:08:40.760,0:08:42.539
|
|
other threads
|
|
|
|
0:08:42.539,0:08:44.039
|
|
won't be able to
|
|
|
|
0:08:44.039,0:08:46.090
|
|
access the data until
|
|
|
|
0:08:46.090,0:08:48.730
|
|
this lock is released
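[Editor's sketch] The mutual exclusion just described can be modeled in user space. This is only an illustration in Python, not FreeBSD kernel code: `threading.Lock` stands in for a kernel mutex, and the name `run_counter` and the shared counter are assumptions made up for the example.

```python
import threading


def run_counter(n_threads=4, increments=10_000):
    """Increment a shared counter from several threads under one mutex."""
    lock = threading.Lock()   # stands in for a kernel mutex
    state = {"count": 0}      # the protected data

    def worker():
        for _ in range(increments):
            with lock:        # only the current owner may touch the data
                state["count"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]
```

With the lock in place every increment is applied, so the result is always `n_threads * increments`.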
|
|
|
|
0:08:48.730,0:08:50.430
|
|
we also offer
|
|
|
|
0:08:50.430,0:08:54.890
|
|
a kind of lock called R/W locks, read/write locks,
|
|
|
|
0:08:54.890,0:08:57.920
|
|
which are basically
|
|
|
|
0:08:57.920,0:09:03.050
|
|
locks that can be acquired in two different modes:
|
|
|
|
0:09:03.050,0:09:04.060
|
|
one version
|
|
|
|
0:09:04.060,0:09:07.980
|
|
is the write mode, which is the same as the mutex: just one thread
|
|
|
|
0:09:07.980,0:09:10.010
|
|
in the protected path at a time,
|
|
|
|
0:09:10.010,0:09:13.860
|
|
and the other one is the read mode, which basically
|
|
|
|
0:09:13.860,0:09:15.100
|
|
allows
|
|
|
|
0:09:15.100,0:09:18.410
|
|
all the threads willing to acquire it in read mode to
|
|
|
|
0:09:18.410,0:09:23.699
|
|
concurrently access the structure, but prevents threads from
|
|
|
|
0:09:23.699,0:09:25.390
|
|
writing to the protected path.
|
|
|
|
0:09:25.390,0:09:28.890
|
|
while there are readers.
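[Editor's sketch] The two acquisition modes can be modeled with a small user-space class. This is an illustrative Python model, not the kernel's rwlock implementation; the class name `RWLock` and its method names are assumptions, and the sketch makes no attempt to protect writers from starvation.

```python
import threading


class RWLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of threads holding the lock in read mode
        self._writer = False    # True while a thread holds it in write mode

    def acquire_read(self):
        with self._cond:
            while self._writer:             # readers only wait for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()     # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def try_write(self):
        """Non-blocking write acquire; returns False if anyone holds the lock."""
        with self._cond:
            if self._writer or self._readers > 0:
                return False
            self._writer = True
            return True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

Two readers can hold the lock together, and `try_write()` fails until both have released it.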
|
|
|
|
0:09:28.890,0:09:30.280
|
|
then we also have
|
|
|
|
0:09:30.280,0:09:33.030
|
|
locks called read-mostly locks,
|
|
|
|
0:09:33.030,0:09:37.570
|
|
which are basically the same as read/write locks, but
|
|
|
|
0:09:37.570,0:09:42.500
|
|
they have some optimization in order to make the Read
|
|
|
|
0:09:42.500,0:09:44.180
|
|
part be really fast
|
|
|
|
0:09:44.180,0:09:46.930
|
|
and to have like
|
|
|
|
0:09:46.930,0:09:48.180
|
|
zero overhead
|
|
|
|
0:09:48.180,0:09:51.410
|
|
zero overhead kind of lock
|
|
|
|
0:09:51.410,0:09:53.350
|
|
from the read path while
|
|
|
|
0:09:53.350,0:09:55.590
|
|
probably the write path is even
|
|
|
|
0:09:55.590,0:09:59.210
|
|
heavier than the other one but if you think about cases that
|
|
|
|
0:09:59.210,0:10:01.710
|
|
just
|
|
|
|
|
|
0:10:01.710,0:10:02.750
|
|
where
|
|
|
|
0:10:02.750,0:10:06.980
|
|
there are a lot of read accesses and very few write accesses, you can find that a
|
|
|
|
0:10:06.980,0:10:08.220
|
|
very useful
|
|
|
|
0:10:08.220,0:10:11.070
|
|
very useful primitive
|
|
|
|
0:10:11.070,0:10:11.850
|
|
then we have
|
|
|
|
0:10:11.850,0:10:14.360
|
|
some form of Wait channels
|
|
|
|
0:10:14.360,0:10:16.030
|
|
Wait channels
|
|
|
|
0:10:16.030,0:10:17.140
|
|
basically are what
|
|
|
|
0:10:17.140,0:10:21.700
|
|
generalizations of what other people call
|
|
|
|
0:10:22.470,0:10:24.240
|
|
condition variables, and
|
|
|
|
0:10:24.240,0:10:28.240
|
|
they basically let a thread sleep
|
|
|
|
0:10:28.240,0:10:30.870
|
|
under some conditions that are
|
|
|
|
0:10:30.870,0:10:35.200
|
|
that are previously started with some
|
|
|
|
0:10:35.200,0:10:36.610
|
|
some variables
|
|
|
|
0:10:36.610,0:10:37.150
|
|
usually
|
|
|
|
0:10:37.150,0:10:39.500
|
|
having a Wait channel means that its
|
|
|
|
0:10:39.500,0:10:45.080
|
|
accesses are controlled through another locking primitive, like a mutex
|
|
|
|
0:10:45.080,0:10:46.640
|
|
or an R/W lock,
|
|
|
|
0:10:46.640,0:10:52.010
|
|
and so often the Wait channel is associated to its
|
|
|
|
0:10:52.010,0:10:53.620
|
|
to its locking primitive
|
|
|
|
0:10:53.620,0:11:00.140
|
|
usually, if you have the necessity to use a wait channel without a primitive,
|
|
|
|
0:11:00.140,0:11:04.150
|
|
a locking primitive you probably have bad code
|
|
|
|
0:11:04.150,0:11:06.830
|
|
but there are some edge cases
|
|
|
|
0:11:06.830,0:11:09.660
|
|
where that seems possible.
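[Editor's sketch] The pairing of a wait channel with a locking primitive can be shown with a condition variable bound to a mutex. This is a user-space Python illustration, not the kernel interface; the function name `handoff` and the one-slot `box` are assumptions for the example.

```python
import threading


def handoff(value):
    """Pass one non-None value from a producer to a sleeping consumer."""
    mutex = threading.Lock()
    ready = threading.Condition(mutex)    # wait channel tied to the mutex
    box = {"item": None}                  # data protected by the mutex
    result = []

    def consumer():
        with mutex:
            while box["item"] is None:    # always re-check the condition under the lock
                ready.wait()              # releases the mutex while asleep
            result.append(box["item"])

    t = threading.Thread(target=consumer)
    t.start()
    with mutex:
        box["item"] = value               # change the condition under the lock
        ready.notify()                    # then wake the sleeper
    t.join()
    return result[0]
```

Because the condition is tested and changed under the same mutex, the wakeup cannot be lost whichever thread runs first.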
|
|
|
|
0:11:09.660,0:11:13.550
|
|
As the last thing, FreeBSD supports the primitive counting semaphore,
|
|
|
|
0:11:13.550,0:11:15.290
|
|
even if that's considered not featured
|
|
|
|
0:11:15.290,0:11:17.710
|
|
as we are going to see I think they're going to see it and
|
|
|
|
0:11:17.710,0:11:23.570
|
|
its usage is pretty much discouraged
|
|
|
|
0:11:23.570,0:11:28.320
|
|
basically, in FreeBSD you can consider locking primitives divided into three classes,
|
|
|
|
0:11:28.320,0:11:31.250
|
|
three classes of
|
|
|
|
0:11:31.250,0:11:32.450
|
|
of locking
|
|
|
|
0:11:32.450,0:11:34.090
|
|
based mainly in
|
|
|
|
0:11:34.090,0:11:35.600
|
|
particular
|
|
|
|
0:11:35.600,0:11:37.340
|
|
from an outside perspective
|
|
|
|
0:11:37.340,0:11:38.690
|
|
based on the behavior
|
|
|
|
0:11:38.690,0:11:42.680
|
|
of the contending threads with regard to the lock.
|
|
|
|
0:11:42.680,0:11:48.100
|
|
for example, in the case of a mutex, you can see that
|
|
|
|
0:11:48.100,0:11:53.360
|
|
spinning and blocking mutexes do very different things with the contenders,
|
|
|
|
0:11:53.360,0:11:59.680
|
|
as we are going to see more of this later
|
|
|
|
0:11:59.680,0:12:03.410
|
|
usually in the traditional literature,
|
|
|
|
0:12:03.410,0:12:05.430
|
|
there are just two
|
|
|
|
0:12:05.430,0:12:07.280
|
|
cases of the lock classes mainly
|
|
|
|
0:12:07.280,0:12:08.620
|
|
you will find the
|
|
|
|
0:12:08.620,0:12:11.200
|
|
spinning lock and the blocking lock
|
|
|
|
0:12:11.200,0:12:14.370
|
|
or what is called the sleeping lock.
|
|
|
|
0:12:14.370,0:12:16.670
|
|
the I think that
|
|
|
|
0:12:16.670,0:12:21.020
|
|
as we're going to see why we have three types I think that things will be clear but
|
|
|
|
0:12:21.020,0:12:27.100
|
|
if you have any questions, please ask. That's not a problem.
|
|
|
|
0:12:27.100,0:12:29.930
|
|
Spinning primitives as I told you
|
|
|
|
0:12:29.930,0:12:32.810
|
|
allow the contending thread
|
|
|
|
0:12:32.810,0:12:36.120
|
|
to check the status of the lock periodically
|
|
|
|
0:12:36.120,0:12:37.590
|
|
and the
|
|
|
|
0:12:37.590,0:12:40.420
|
|
and they just do busy waiting around
|
|
|
|
0:12:40.420,0:12:41.890
|
|
the locking variable
|
|
|
|
0:12:41.890,0:12:46.400
|
|
as a spinning primitive, FreeBSD just offers the mutex.
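[Editor's sketch] Busy waiting on the lock variable can be modeled as a loop of non-blocking acquire attempts. This is a user-space Python illustration only: the function name `spin_acquire` is an assumption, and a real kernel spin mutex also disables interrupts, which this model does not show.

```python
import threading


def spin_acquire(lock):
    """Busy-wait until the lock is free; return how many attempts failed."""
    spins = 0
    while not lock.acquire(blocking=False):   # check the lock state again
        spins += 1                             # CPU time burned while waiting
    return spins
```

The returned count makes the cost visible: every failed attempt is work done by the contender that achieves nothing useful.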
|
|
|
|
0:12:46.400,0:12:50.689
|
|
What are the problems linked with this kind of, with this class
|
|
|
|
0:12:50.689,0:12:53.869
|
|
of locks? Mainly, it's that the CPU
|
|
|
|
0:12:53.869,0:12:58.130
|
|
remains busy without really doing anything useful.
|
|
|
|
0:12:58.130,0:12:59.740
|
|
it happens
|
|
|
|
0:12:59.740,0:13:03.620
|
|
that if several threads contend on the
|
|
|
|
0:13:03.620,0:13:04.870
|
|
on the locks
|
|
|
|
0:13:04.870,0:13:08.210
|
|
basically they share the same cache line where the lock is
|
|
|
|
0:13:08.210,0:13:10.220
|
|
where the lock is
|
|
|
|
0:13:10.220,0:13:12.400
|
|
that means that
|
|
|
|
0:13:12.400,0:13:17.470
|
|
contending on or sharing a cache line means a lot of underlying activity
|
|
|
|
0:13:17.470,0:13:20.150
|
|
on a lot of architectures like for example
|
|
|
|
0:13:20.150,0:13:23.660
|
|
having a lot of snoop messages between CPUs
|
|
|
|
0:13:23.660,0:13:26.450
|
|
and some bus
|
|
|
|
0:13:26.450,0:13:28.120
|
|
some bus traffic,
|
|
|
|
0:13:28.120,0:13:31.980
|
|
which are very heavy operations,
|
|
|
|
0:13:31.980,0:13:35.740
|
|
and the last thing, even the most important one you can note, is that interrupts
|
|
|
|
0:13:35.740,0:13:37.120
|
|
are disabled
|
|
|
|
0:13:37.120,0:13:39.330
|
|
while spin locks are held
|
|
|
|
0:13:39.330,0:13:40.810
|
|
that was
|
|
|
|
0:13:40.810,0:13:42.979
|
|
that happens mainly because there are
|
|
|
|
0:13:42.979,0:13:45.140
|
|
there were identified in the past some
|
|
|
|
0:13:45.140,0:13:47.970
|
|
kind of deadlocks possible
|
|
|
|
0:13:47.970,0:13:50.180
|
|
if you were going to leave
|
|
|
|
0:13:50.180,0:13:51.710
|
|
the spin locks
|
|
|
|
0:13:51.710,0:13:55.900
|
|
the interrupts enabled while holding a spin lock. In particular
|
|
|
|
0:13:55.900,0:13:58.180
|
|
you could find that there are
|
|
|
|
0:13:58.180,0:14:02.530
|
|
some problems with the interrupt handling code in the bottom half that was
|
|
|
|
0:14:02.530,0:14:05.040
|
|
going to deadlock
|
|
|
|
0:14:05.040,0:14:10.250
|
|
It's not a very simple thing to understand, so I've left it out,
|
|
|
|
0:14:10.250,0:14:12.360
|
|
but if you want to know
|
|
|
|
0:14:12.360,0:14:15.990
|
|
we could speak later probably
|
|
|
|
0:14:17.820,0:14:21.320
|
|
along with spinning primitives, we also have blocking primitives.
|
|
|
|
0:14:21.320,0:14:22.890
|
|
blocking primitives
|
|
|
|
0:14:25.260,0:14:26.860
|
|
allow
|
|
|
|
0:14:26.860,0:14:28.440
|
|
basically the contenders to be
|
|
|
|
0:14:28.440,0:14:30.980
|
|
descheduled from the runqueue
|
|
|
|
0:14:30.980,0:14:35.790
|
|
to be put on another kind of container
|
|
|
|
0:14:35.790,0:14:38.000
|
|
put on another kind of container
|
|
|
|
0:14:38.000,0:14:40.489
|
|
and basically
|
|
|
|
0:14:40.489,0:14:41.399
|
|
context switch immediately
|
|
|
|
0:14:41.399,0:14:44.360
|
|
immediately.
|
|
|
|
0:14:44.360,0:14:49.440
|
|
they are then put again on the runqueue of the scheduler just when the owner
|
|
|
|
0:14:49.440,0:14:51.570
|
|
is going to release the lock
|
|
|
|
0:14:51.570,0:14:53.260
|
|
and it will be the owner
|
|
|
|
0:14:53.260,0:14:56.930
|
|
the owner that was going to
|
|
|
|
0:14:56.930,0:15:00.310
|
|
do all the operations about that
|
|
|
|
0:15:00.310,0:15:05.550
|
|
we have several primitives implemented as blocking primitives like mutexes
|
|
|
|
0:15:05.550,0:15:10.470
|
|
R/W locks and R-M locks
|
|
|
|
0:15:11.430,0:15:13.140
|
|
with
|
|
|
|
0:15:13.140,0:15:16.890
|
|
basically with
|
|
|
|
0:15:16.890,0:15:21.780
|
|
blocking primitives we have a lot of advantages over the spinning mutex
|
|
|
|
0:15:21.780,0:15:24.650
|
|
like having the contenders
|
|
|
|
0:15:24.650,0:15:26.560
|
|
that
|
|
|
|
0:15:26.560,0:15:27.590
|
|
that sleeps
|
|
|
|
0:15:27.590,0:15:31.840
|
|
or that block, which avoids keeping the CPU busy,
|
|
|
|
0:15:31.840,0:15:34.660
|
|
and mainly we can leave the
|
|
|
|
0:15:34.660,0:15:37.150
|
|
we can leave the
|
|
|
|
0:15:37.150,0:15:42.040
|
|
we can basically leave the interrupts on.
|
|
|
|
0:15:42.040,0:15:45.760
|
|
that happens mainly because the interrupt code is just allowed,
|
|
|
|
0:15:45.760,0:15:50.710
|
|
at least the bottom half one is just allowed,
|
|
|
|
0:15:50.710,0:15:52.070
|
|
to use spin locks
|
|
|
|
0:15:52.070,0:15:56.049
|
|
probably if it was going to use blocking primitives
|
|
|
|
0:15:56.049,0:16:01.060
|
|
we wouldn't have been able to disable interrupts here
|
|
|
|
0:16:01.060,0:16:02.239
|
|
There are however some
|
|
|
|
0:16:02.239,0:16:04.790
|
|
big drawbacks that as you will see
|
|
|
|
0:16:04.790,0:16:07.210
|
|
we handle in FreeBSD
|
|
|
|
0:16:07.210,0:16:11.280
|
|
in order to make the blocking primitives our,
|
|
|
|
0:16:11.280,0:16:13.540
|
|
how could I tell
|
|
|
|
0:16:13.540,0:16:16.440
|
|
the first choice in terms of blocking
|
|
|
|
0:16:16.440,0:16:19.690
|
|
we have the problem called Priority Inversion
|
|
|
|
0:16:19.690,0:16:21.899
|
|
and we have
|
|
|
|
0:16:21.899,0:16:27.589
|
|
the problem that context switches are very heavy in particular
|
|
|
|
0:16:27.589,0:16:30.209
|
|
on the architectures that FreeBSD uses as reference,
|
|
|
|
0:16:30.209,0:16:33.500
|
|
like i386 and amd64,
|
|
|
|
0:16:33.500,0:16:37.940
|
|
but as you're going to see we've used two techniques in order to
|
|
|
|
0:16:37.940,0:16:40.020
|
|
to cope with that
|
|
|
|
0:16:42.020,0:16:45.830
|
|
another thing is that, since you can't
|
|
|
|
0:16:45.830,0:16:47.920
|
|
allow
|
|
|
|
0:16:47.920,0:16:50.089
|
|
context switches while having
|
|
|
|
0:16:50.089,0:16:52.570
|
|
while holding a spin lock,
|
|
|
|
0:16:52.570,0:16:55.249
|
|
it's obvious you can't
|
|
|
|
0:16:55.249,0:16:59.580
|
|
acquire a blocking primitive while holding a spin lock.
|
|
|
|
0:16:59.580,0:17:02.110
|
|
that's an important rule in FreeBSD
|
|
|
|
0:17:02.110,0:17:06.089
|
|
that is sometimes confused and often is not
|
|
|
|
0:17:06.089,0:17:07.470
|
|
observed
|
|
|
|
0:17:07.470,0:17:09.929
|
|
that leads to block refusal
|
|
|
|
0:17:12.170,0:17:16.610
|
|
usually you will always prefer a blocking primitive over a spin lock,
|
|
|
|
0:17:16.610,0:17:22.159
|
|
except in some very particular conditions, like what
|
|
|
|
0:17:22.159,0:17:25.010
|
|
I already said about the interrupts, and even
|
|
|
|
0:17:25.010,0:17:26.090
|
|
about the
|
|
|
|
0:17:28.160,0:17:30.570
|
|
some parts that are very very short
|
|
|
|
0:17:30.570,0:17:33.629
|
|
we should have some example in the kernel even if I can
|
|
|
|
0:17:33.629,0:17:35.390
|
|
I can't tell you one right now
|
|
|
|
0:17:35.390,0:17:38.770
|
|
I have no idea actually
|
|
|
|
0:17:38.770,0:17:39.500
|
|
so that
|
|
|
|
0:17:39.500,0:17:43.740
|
|
we're going to see the problems linked with the blocking primitives. The first one is
|
|
|
|
0:17:43.740,0:17:45.679
|
|
called Priority Inversion
|
|
|
|
0:17:45.679,0:17:46.389
|
|
basically
|
|
|
|
0:17:46.389,0:17:49.130
|
|
it could happen that like a thread A
|
|
|
|
0:17:49.130,0:17:51.410
|
|
which has a low priority,
|
|
|
|
0:17:51.410,0:17:55.380
|
|
owns a lock; call it L, for example.
|
|
|
|
0:17:55.380,0:17:58.710
|
|
then another thread, with a higher priority than this one,
|
|
|
|
0:17:58.710,0:18:00.690
|
|
blocks on this lock.
|
|
|
|
0:18:00.690,0:18:03.299
|
|
what happens is that the second thread
|
|
|
|
0:18:03.299,0:18:04.120
|
|
the thread B
|
|
|
|
0:18:04.120,0:18:05.870
|
|
for example
|
|
|
|
0:18:05.870,0:18:08.920
|
|
will need to wait for a lower priority thread
|
|
|
|
0:18:08.920,0:18:13.070
|
|
to finish its work load
|
|
|
|
0:18:13.070,0:18:15.120
|
|
we
|
|
|
|
0:18:15.120,0:18:17.780
|
|
solve this problem actually in the
|
|
|
|
0:18:17.780,0:18:21.170
|
|
kernel using a technique called priority propagation
|
|
|
|
0:18:21.170,0:18:22.020
|
|
basically
|
|
|
|
0:18:22.020,0:18:24.620
|
|
what happens is that priority of thread B
|
|
|
|
0:18:25.760,0:18:27.880
|
|
is lent to thread A
|
|
|
|
0:18:27.880,0:18:31.460
|
|
until it releases the lock,
|
|
|
|
0:18:31.460,0:18:34.760
|
|
and it's directly implemented in the container,
|
|
|
|
0:18:34.760,0:18:36.180
|
|
the turnstiles
|
|
|
|
0:18:37.870,0:18:39.530
|
|
while that could be done
|
|
|
|
0:18:39.530,0:18:44.290
|
|
even in the primitive, it has been much more convenient to use the container for
|
|
|
|
0:18:44.290,0:18:45.190
|
|
that
|
|
|
|
0:18:45.190,0:18:45.990
|
|
because
|
|
|
|
0:18:45.990,0:18:52.990
|
|
it was going to offer some advantage we are going to see right now
|
|
|
|
0:18:53.030,0:18:54.240
|
|
just note that
|
|
|
|
0:18:54.240,0:18:56.090
|
|
Read locks
|
|
|
|
0:18:56.090,0:18:57.310
|
|
cannot support
|
|
|
|
0:18:57.310,0:19:03.430
|
|
priority propagation for read locks. That happens because
|
|
|
|
0:19:03.430,0:19:07.290
|
|
the turnstile would need to keep track of all the readers,
|
|
|
|
0:19:07.290,0:19:11.100
|
|
and this would be very, very expensive from
|
|
|
|
0:19:11.100,0:19:12.880
|
|
from a
|
|
|
|
0:19:12.880,0:19:15.540
|
|
from a point of view of the overhead
|
|
|
|
0:19:15.540,0:19:19.800
|
|
and even I think I've tried to do something in this regard and I
|
|
|
|
0:19:19.800,0:19:24.050
|
|
saw that there was some races that were trying to
|
|
|
|
0:19:24.050,0:19:29.390
|
|
acquire a spin lock even in the fast path, so it was
|
|
|
|
0:19:29.390,0:19:31.320
|
|
an impracticable way,
|
|
|
|
0:19:31.320,0:19:32.380
|
|
I will tell
|
|
|
|
0:19:32.380,0:19:37.200
|
|
at least for what we found so far
|
|
|
|
0:19:37.200,0:19:37.630
|
|
basically
|
|
|
|
0:19:37.630,0:19:39.070
|
|
what happens
|
|
|
|
0:19:39.070,0:19:42.150
|
|
about the priority propagation is that the
|
|
|
|
0:19:42.150,0:19:44.830
|
|
the threads and the turnstiles
|
|
|
|
0:19:44.830,0:19:47.000
|
|
are chained together
|
|
|
|
0:19:47.000,0:19:48.350
|
|
the thread
|
|
|
|
0:19:48.350,0:19:50.970
|
|
owns a pointer
|
|
|
|
0:19:50.970,0:19:53.710
|
|
to the turnstile it is sleeping on,
|
|
|
|
0:19:53.710,0:19:58.540
|
|
and the turnstile owns a pointer to
|
|
|
|
0:19:58.540,0:20:00.549
|
|
the owner of the lock
|
|
|
|
0:20:00.549,0:20:04.620
|
|
what happens is that for example in this case we have
|
|
|
|
0:20:05.080,0:20:08.070
|
|
a sleeper which is going to sleep on a turnstile
|
|
|
|
0:20:08.070,0:20:08.990
|
|
the first lock
|
|
|
|
0:20:08.990,0:20:13.470
|
|
which has a priority of one hundred and twenty eight
|
|
|
|
0:20:14.120,0:20:15.520
|
|
the turnstile
|
|
|
|
0:20:15.520,0:20:18.370
|
|
through the pointer
|
|
|
|
0:20:18.370,0:20:20.570
|
|
ts_owner knows which is its owner
|
|
|
|
0:20:20.570,0:20:26.150
|
|
and this owner has a priority of two hundred and fifty six
|
|
|
|
0:20:26.150,0:20:31.120
|
|
well, as you know, a higher value means lower priority, so this is
|
|
0:20:31.120,0:20:34.960
|
|
a suitable case for priority propagation,
|
|
|
|
0:20:34.960,0:20:40.820
|
|
but what happens is that this owner is actually sleeping on another turnstile
|
|
|
|
0:20:40.820,0:20:43.419
|
|
and the other owner
|
|
|
|
0:20:43.419,0:20:48.820
|
|
of the second turnstile always has the same priority as its sleepers,
|
|
|
|
0:20:48.820,0:20:50.750
|
|
so
|
|
|
|
0:20:50.750,0:20:55.530
|
|
just propagating priority to the first owner would be useless because the first
|
|
|
|
0:20:55.530,0:20:56.340
|
|
one
|
|
|
|
0:20:56.340,0:20:57.320
|
|
could
|
|
|
|
0:20:57.320,0:20:58.760
|
|
still
|
|
|
|
0:20:58.760,0:21:00.580
|
|
keep the chain to a
|
|
|
|
0:21:00.580,0:21:04.820
|
|
lower priority, so it is going to be propagated to the first
|
|
|
|
0:21:04.820,0:21:07.679
|
|
actually running
|
|
|
|
0:21:07.679,0:21:09.870
|
|
owner of the chain
|
|
|
|
0:21:09.870,0:21:14.670
|
|
this is the situation after the propagation: as you can see, all of the threads in the chain
|
|
|
|
0:21:14.670,0:21:16.559
|
|
have the same priority,
|
|
|
|
0:21:16.559,0:21:17.950
|
|
the highest possible,
|
|
|
|
0:21:17.950,0:21:24.480
|
|
in this case the one of the last one arriving.
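[Editor's sketch] The walk along the chain of threads and turnstiles can be modeled in a few lines. This is a simplified Python model, not the kernel's turnstile code: the class names `KThread` and `Turnstile`, the field `blocked_on`, and the function `propagate_priority` are assumptions for illustration; only the `owner` field mirrors the `ts_owner` pointer mentioned in the talk. Lower values mean higher priority, as in the example with 128 and 256.

```python
class KThread:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority    # lower value = higher priority
        self.blocked_on = None      # turnstile this thread sleeps on, if any


class Turnstile:
    def __init__(self, owner):
        self.owner = owner          # thread holding the lock (like ts_owner)


def propagate_priority(sleeper):
    """Lend the sleeper's priority to every lock owner along the chain."""
    prio = sleeper.priority
    ts = sleeper.blocked_on
    while ts is not None:
        owner = ts.owner
        if owner.priority <= prio:
            break                   # owner is already at least as important
        owner.priority = prio       # lend the priority to this owner
        ts = owner.blocked_on       # follow the chain if the owner sleeps too
```

Usage, following the situation on the slide: a sleeper at 128 blocks on a lock whose owner (256) is itself asleep on a second lock owned by another thread at 256. After `propagate_priority(sleeper)`, both owners run at 128.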
|
|
|
|
0:21:25.750,0:21:31.720
|
|
are there questions about that?
|
|
|
|
0:21:31.720,0:21:34.780
|
|
no?
|
|
|
|
0:21:34.780,0:21:36.760
|
|
yeah when the
|
|
|
|
0:21:36.760,0:21:39.720
|
|
when the for example the third owner
|
|
|
|
0:21:39.720,0:21:41.679
|
|
the second owner there
|
|
|
|
0:21:41.679,0:21:43.659
|
|
when it goes to release the lock
|
|
|
|
0:21:43.659,0:21:47.010
|
|
it basically brings back the priority to the
|
|
|
|
0:21:47.010,0:21:49.340
|
|
to the
|
|
|
|
0:21:49.340,0:21:52.490
|
|
two hundred and fifty-six to all the chains
|
|
|
|
0:21:52.490,0:21:54.650
|
|
he is responsible for
|
|
|
|
0:21:54.650,0:22:01.179
|
|
so it just happens at locking operation
|
|
|
|
0:22:01.179,0:22:04.159
|
|
and that is what we do about the Priority Inversion
|
|
|
|
0:22:04.159,0:22:09.970
|
|
in order to fix, instead, the overhead given by the
|
|
|
|
0:22:09.970,0:22:14.030
|
|
large number of context switches, we use another technique called adaptive spinning.
|
|
|
|
0:22:14.030,0:22:16.030
|
|
basically
|
|
|
|
0:22:16.030,0:22:20.260
|
|
as the context switch brings a lot of overhead
|
|
|
|
0:22:22.310,0:22:26.090
|
|
we prefer to not do
|
|
|
|
0:22:26.090,0:22:27.770
|
|
completely a context switch
|
|
|
|
0:22:27.770,0:22:30.760
|
|
in the case the lock owner is still running
|
|
|
|
0:22:30.760,0:22:32.190
|
|
on a runqueue
|
|
|
|
0:22:32.190,0:22:38.340
|
|
because there is a very good chance that the owner is going to release the lock very soon.
|
|
|
|
0:22:40.440,0:22:43.990
|
|
that means that for example
|
|
|
|
0:22:43.990,0:22:46.070
|
|
we choose just to spin
|
|
|
|
0:22:46.070,0:22:49.149
|
|
waiting for the state of the
|
|
|
|
0:22:49.149,0:22:52.240
|
|
lock to change, or for the state of the owner
|
|
|
|
0:22:52.240,0:22:57.660
|
|
to change, like the owner going to sleep on another turnstile
|
|
|
|
0:22:57.660,0:22:59.140
|
|
and the
|
|
|
|
0:22:59.140,0:23:03.270
|
|
basically, there have been very extensive measurements, even in
|
|
|
|
0:23:03.270,0:23:07.510
|
|
other operating systems like Solaris,
|
|
|
|
0:23:07.510,0:23:12.300
|
|
where I think this approach was brought in for the first time,
|
|
|
|
0:23:12.300,0:23:16.430
|
|
that were showing
|
|
|
|
0:23:16.430,0:23:23.430
|
|
a very big improvement in performance from this technique
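[Editor's sketch] The adaptive spinning decision can be modeled as a loop that keeps spinning only while the lock owner is running. This is a simplified Python model, not the kernel implementation: the names `Owner`, `Lock`, `adaptive_wait`, and the `timeline` of state changes (standing in for what other CPUs do between two checks) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Owner:
    running: bool                  # True while the owner executes on some CPU


@dataclass
class Lock:
    owner: Optional[Owner] = None


def adaptive_wait(lock, me, timeline):
    """Trace what a contender does: spin while the owner runs, else block."""
    trace = []
    changes = iter(timeline)
    while lock.owner is not None:
        if not lock.owner.running:
            trace.append("block")  # owner is off CPU: go to sleep instead
            return trace
        trace.append("spin")       # owner is running: release is likely soon
        change = next(changes, None)
        if change is None:
            return trace           # the model ran out of events while spinning
        change(lock)               # something happens on another CPU
    lock.owner = me
    trace.append("acquired")
    return trace
```

If the owner releases the lock while the contender is spinning, the trace ends in `"acquired"` with no context switch; if the owner stops running, the contender gives up spinning and blocks.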
|
|
|
|
0:23:25.790,0:23:30.640
|
|
apart from these two types of primitives, there are sleeping primitives.
|
|
|
|
0:23:30.640,0:23:36.120
|
|
now there is a consideration we have to make about that
|
|
|
|
0:23:36.120,0:23:38.110
|
|
basically sleeping primitives
|
|
|
|
0:23:38.110,0:23:42.320
|
|
should be in theory just the
|
|
|
|
0:23:42.320,0:23:44.340
|
|
wait channels
|
|
|
|
0:23:44.340,0:23:49.170
|
|
wait channels should have been the only one implemented using the
|
|
|
|
0:23:49.170,0:23:50.630
|
|
container called
|
|
|
|
0:23:50.630,0:23:52.760
|
|
sleepqueue
|
|
|
|
0:23:52.760,0:23:53.910
|
|
but
|
|
|
|
0:23:53.910,0:23:56.170
|
|
due to some legacy
|
|
|
|
0:23:56.170,0:24:01.000
|
|
actually the sleepqueues were used to implement other
|
|
|
|
0:24:01.000,0:24:03.290
|
|
kinds of locks, like the
|
|
|
|
0:24:03.290,0:24:04.219
|
|
lockmgr
|
|
|
|
0:24:04.219,0:24:08.080
|
|
and the sx locks and the
|
|
|
|
0:24:08.080,0:24:11.100
|
|
basically the
|
|
|
|
0:24:11.100,0:24:13.679
|
|
semaphores, and condvars too
|
|
|
|
0:24:13.679,0:24:16.010
|
|
and this is
|
|
|
|
0:24:16.010,0:24:18.809
|
|
going to give some problems actually
|
|
|
|
0:24:18.809,0:24:19.350
|
|
because
|
|
|
|
0:24:20.450,0:24:24.820
|
|
as we're going to see
|
|
|
|
0:24:24.820,0:24:26.889
|
|
and as you can see on the slide too
|
|
|
|
0:24:26.889,0:24:27.929
|
|
in FreeBSD
|
|
|
|
0:24:27.929,0:24:31.600
|
|
while sleeping, threads should not hold any kind of lock
|
|
|
|
0:24:31.600,0:24:33.809
|
|
neither blocking nor spinning
|
|
|
|
0:24:33.809,0:24:36.770
|
|
that's a simple thing to explain
|
|
|
|
0:24:36.770,0:24:40.200
|
|
we just want to enforce very
|
|
|
|
0:24:40.200,0:24:43.490
|
|
we just want to enforce
|
|
|
|
0:24:43.490,0:24:46.060
|
|
correct semantics of locking
|
|
|
|
0:24:46.060,0:24:47.880
|
|
so imagine holding a lock,
|
|
|
|
0:24:47.880,0:24:50.190
|
|
a blocking primitive while
|
|
|
|
0:24:50.190,0:24:50.729
|
|
sleeping
|
|
|
|
0:24:50.729,0:24:53.010
|
|
it's going to waste a lot of time
|
|
|
|
0:24:53.010,0:24:56.530
|
|
because all the contenders are going to
|
|
|
|
0:24:56.530,0:24:58.760
|
|
stall on the
|
|
|
|
0:24:58.760,0:25:01.400
|
|
lock owner which is sleeping
|
|
|
|
0:25:01.400,0:25:03.120
|
|
basically in fact what
|
|
|
|
0:25:03.120,0:25:07.169
|
|
condition variables usually do, as you should know, is to drop the lock
|
|
|
|
0:25:07.169,0:25:11.070
|
|
once it has been passed to the primitive
|
|
|
|
0:25:11.070,0:25:12.380
|
|
in this case
|
|
|
|
0:25:14.170,0:25:18.249
|
|
basically we just don't allow that. This means that
|
|
|
|
0:25:18.249,0:25:23.160
|
|
the same condition applies even to other kinds of locks:
|
|
|
|
0:25:23.160,0:25:25.540
|
|
lockmgr and the sx lock
|
|
|
|
0:25:25.540,0:25:26.860
|
|
so you can't hold
|
|
|
|
0:25:26.860,0:25:29.410
|
|
a mutex for example
|
|
|
|
0:25:29.410,0:25:33.640
|
|
a blocking mutex or an rwlock, while trying to acquire
|
|
|
|
0:25:33.640,0:25:38.559
|
|
a lockmgr or an sx lock
|
|
|
|
0:25:38.559,0:25:41.850
|
|
this is going to create some problems because
|
|
|
|
0:25:41.850,0:25:46.830
|
|
in some places that is unavoidable, so you have to drop the lock, for example, and try
|
|
|
|
0:25:46.830,0:25:48.190
|
|
to acquire
|
|
|
|
0:25:48.190,0:25:49.770
|
|
the other primitive
|
|
|
|
0:25:49.770,0:25:51.320
|
|
which is going to
|
|
|
|
0:25:53.400,0:25:59.110
|
|
and so can create some race problems
|
|
|
|
0:26:00.130,0:26:04.779
|
|
as the sleepqueues were born just to serve wait channels,
|
|
|
|
0:26:04.779,0:26:09.190
|
|
they don't track owners either, so they don't care about the priority propagation and priority inversion problems,
|
|
|
|
0:26:09.190,0:26:14.430
|
|
just because sleepqueues inherently should not have owners
|
|
|
|
0:26:14.430,0:26:20.150
|
|
so, for example, lockmgr and sx have no priority propagation
|
|
|
|
0:26:20.150,0:26:22.360
|
|
systems and the
|
|
|
|
0:26:22.360,0:26:29.360
|
|
so their use is discouraged, mainly for this reason
|
|
|
|
0:26:31.590,0:26:34.930
|
|
sure
|
|
|
|
0:26:36.780,0:26:39.000
|
|
it's you mean why it's not
|
|
|
|
0:26:39.000,0:26:41.790
|
|
why don't blocking primitives exist, yeah?
|
|
|
|
0:26:41.790,0:26:44.250
|
|
so imagine that for example the
|
|
|
|
0:26:44.250,0:26:45.570
|
|
you have a wait channel
|
|
|
|
0:26:45.570,0:26:47.679
|
|
condvar a condition variable
|
|
|
|
0:26:47.679,0:26:50.950
|
|
or msleep
|
|
|
|
0:26:50.950,0:26:52.090
|
|
msleep,
|
|
|
|
0:26:52.090,0:26:54.910
|
|
the primitive that allows you to sleep on
|
|
|
|
0:26:54.910,0:26:57.850
|
|
a condition variable for example
|
|
|
|
0:26:57.850,0:26:58.870
|
|
however
|
|
|
|
0:27:00.510,0:27:02.270
|
|
the you are
|
|
|
|
0:27:02.270,0:27:03.350
|
|
using the blocking
|
|
|
|
0:27:03.350,0:27:06.930
|
|
using the turnstile, you will always go through
|
|
|
|
0:27:06.930,0:27:12.110
|
|
the mechanism of priority propagation and priority inversion handling. It's
|
|
|
|
0:27:12.110,0:27:13.760
|
|
not very
|
|
|
|
0:27:13.760,0:27:14.970
|
|
it's pretty
|
|
|
|
0:27:14.970,0:27:17.320
|
|
it's not a simple operation
|
|
|
|
0:27:17.320,0:27:20.219
|
|
it even acquires some spin locks
|
|
|
|
0:27:20.219,0:27:22.650
|
|
in order to avoid some races
|
|
|
|
0:27:22.650,0:27:23.340
|
|
and so
|
|
|
|
0:27:23.340,0:27:24.289
|
|
it
|
|
|
|
0:27:24.289,0:27:26.590
|
|
so it has an overhead
|
|
|
|
0:27:26.590,0:27:31.770
|
|
in this case it would not be useful, it would be completely useless to have
|
|
|
|
0:27:31.770,0:27:34.159
|
|
a mechanism like that so
|
|
|
|
0:27:34.159,0:27:37.410
|
|
in theory, if you had just used
|
|
|
|
0:27:37.410,0:27:41.320
|
|
the sleepqueue for wait channels,
|
|
|
|
0:27:41.320,0:27:42.990
|
|
you would have had a
|
|
|
|
0:27:42.990,0:27:46.640
|
|
bigger performance boost than just using the turnstile
|
|
|
|
0:27:46.640,0:27:49.330
|
|
for the same problem
|
|
|
|
0:27:49.330,0:27:51.310
|
|
in theory
|
|
|
|
0:27:51.310,0:27:54.780
|
|
but what happened is that other locks are implemented
|
|
|
|
0:27:54.780,0:27:55.839
|
|
using this sleepqueue
|
|
|
|
0:27:55.839,0:27:58.070
|
|
that should not have happened
|
|
|
|
0:27:58.070,0:27:59.260
|
|
in principle
|
|
|
|
0:27:59.260,0:28:02.960
|
|
really I'm not sure who introduced the sx lock
|
|
|
|
0:28:02.960,0:28:04.440
|
|
I'm actually not sure
|
|
|
|
0:28:04.440,0:28:06.280
|
|
and even the lockmgr
|
|
|
|
0:28:06.280,0:28:09.870
|
|
but
|
|
|
|
0:28:09.870,0:28:12.340
|
|
however
|
|
|
|
0:28:12.340,0:28:17.669
|
|
as you could see before, the three containers create a hierarchy that
|
|
|
|
0:28:17.669,0:28:20.090
|
|
should not be broken like
|
|
|
|
0:28:20.090,0:28:21.639
|
|
you have spinqueues
|
|
|
|
0:28:21.639,0:28:26.900
|
|
you have spin locks, you have blocking primitives and sleeping primitives, and
|
|
|
|
0:28:26.900,0:28:31.470
|
|
you cannot mix them; there are precise rules, like:
|
|
|
|
0:28:31.470,0:28:33.710
|
|
on the top the sleeping primitives,
|
|
|
|
0:28:33.710,0:28:37.690
|
|
in the middle the blocking primitives and at the bottom the spinning primitives
|
|
|
|
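The ordering rule just described (sleeping primitives on top, blocking primitives in the middle, spinning primitives at the bottom) can be sketched as a small checker. This is only an illustrative userland sketch, not FreeBSD code: the class `HeldLocks`, the level constants and the lock names are all hypothetical, and it only models the rule that a thread may not acquire a lock of a "higher" class while holding one of a "lower" class.

```python
# Hypothetical lock classes, ordered from top (0) to bottom (2) of the hierarchy.
SLEEPABLE, BLOCKING, SPIN = 0, 1, 2
LEVEL_NAMES = {SLEEPABLE: "sleepable", BLOCKING: "blocking", SPIN: "spin"}


class LockOrderError(Exception):
    """Raised when an acquisition would break the class hierarchy."""


class HeldLocks:
    """Tracks the locks one thread holds and checks the class ordering."""

    def __init__(self):
        self.held = []  # list of (name, level), in acquisition order

    def acquire(self, name, level):
        # Acquiring a lock that sits higher in the hierarchy than any lock
        # already held (e.g. a sleepable lock while holding a mutex) is refused.
        for held_name, held_level in self.held:
            if level < held_level:
                raise LockOrderError(
                    f"cannot acquire {LEVEL_NAMES[level]} lock {name!r} "
                    f"while holding {LEVEL_NAMES[held_level]} lock {held_name!r}"
                )
        self.held.append((name, level))

    def release(self, name):
        # Remove the most recent acquisition of the named lock.
        for i in range(len(self.held) - 1, -1, -1):
            if self.held[i][0] == name:
                del self.held[i]
                return
        raise KeyError(name)
```

In this sketch, taking a sleepable lock, then a blocking one, then a spin lock is accepted, while trying to take a sleepable lock with a blocking lock already held raises `LockOrderError`.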
0:28:38.900,0:28:44.440
|
|
the main choice should be to always use blocking primitives
|
|
|
|
0:28:44.440,0:28:48.240
|
|
because, as you can see, we handled a lot of the problems that they have,
|
|
|
|
0:28:48.240,0:28:49.659
|
|
and in practice
|
|
|
|
0:28:49.659,0:28:52.229
|
|
they have proven to be very
|
|
|
|
0:28:52.229,0:28:53.799
|
|
very helpful
|
|
|
|
0:28:53.799,0:28:54.999
|
|
but sometimes
|
|
|
|
0:28:56.789,0:28:58.790
|
|
some nasty conditions can happen
|
|
|
|
0:28:58.790,0:29:02.900
|
|
for example one of the most widespread is the
|
|
|
|
0:29:02.900,0:29:06.350
|
|
using malloc() with the M_WAITOK flag
|
|
|
|
0:29:06.350,0:29:11.240
|
|
in FreeBSD that means that if the allocator is pretty busy, you are going
|
|
|
|
0:29:11.240,0:29:12.680
|
|
to sleep
|
|
|
|
0:29:12.680,0:29:15.760
|
|
in order to retrieve your memory
|
|
|
|
0:29:15.760,0:29:17.890
|
|
and if you do that with a lock held,
|
|
|
|
0:29:17.890,0:29:22.080
|
|
you're going to violate one of our rules and it's not
|
|
|
|
0:29:22.080,0:29:23.440
|
|
possible
|
|
|
|
0:29:23.440,0:29:25.320
|
|
another one is what we just
|
|
|
|
0:29:25.320,0:29:28.299
|
|
said before, like acquiring a sleeping lock while
|
|
|
|
0:29:28.299,0:29:32.090
|
|
holding a blocking primitive
|
|
|
|
0:29:33.390,0:29:37.530
|
|
in the next example I'm going to show you a way
|
|
|
|
0:29:37.530,0:29:41.140
|
|
to handle, for example, the malloc() case
|
|
|
|
0:29:41.140,0:29:42.520
|
|
and similar
|
|
|
|
0:29:42.520,0:29:45.000
|
|
but usually
|
|
|
|
0:29:46.830,0:29:47.620
|
|
usually those
|
|
|
|
0:29:47.620,0:29:49.980
|
|
are not very common cases
|
|
|
|
0:29:49.980,0:29:52.920
|
|
at least for simple parts
|
|
|
|
0:29:52.920,0:29:56.280
|
|
you should even try to avoid the
|
|
|
|
0:29:56.280,0:30:03.280
|
|
the
|
|
|
|
0:30:04.620,0:30:06.180
|
|
yes
|
|
|
|
0:30:06.180,0:30:07.050
|
|
even in the
|
|
|
|
0:30:07.050,0:30:09.120
|
|
in the
|
|
|
|
0:30:09.120,0:30:10.220
|
|
wait channel
|
|
|
|
0:30:10.220,0:30:14.530
|
|
in FreeBSD you can differentiate between condition variables and
|
|
|
|
0:30:14.530,0:30:15.720
|
|
msleep
|
|
|
|
0:30:15.720,0:30:17.510
|
|
usually msleep was
|
|
|
|
0:30:17.510,0:30:22.210
|
|
really, msleep was introduced as the first primitive,
|
|
|
|
0:30:22.210,0:30:26.190
|
|
but it has an interface that is very, very difficult
|
|
|
|
0:30:26.190,0:30:28.460
|
|
to make saner and to understand
|
|
|
|
0:30:28.460,0:30:30.470
|
|
at least for
|
|
|
|
0:30:30.470,0:30:31.220
|
|
for people
|
|
|
|
0:30:31.220,0:30:32.120
|
|
which are
|
|
|
|
0:30:32.120,0:30:34.960
|
|
comfortable with
|
|
|
|
0:30:34.960,0:30:39.260
|
|
with the interface of condition variables that we all saw, but they are
|
|
|
|
0:30:39.260,0:30:40.649
|
|
a newer primitive
|
|
|
|
0:30:40.649,0:30:42.660
|
|
mainly there is
|
|
|
|
0:30:42.660,0:30:44.400
|
|
so for the newer code
|
|
|
|
0:30:44.400,0:30:46.960
|
|
what you should do is just to
|
|
|
|
0:30:46.960,0:30:49.000
|
|
use condition variables
|
|
|
|
0:30:49.000,0:30:50.659
|
|
and not msleep
|
|
|
|
0:30:50.659,0:30:51.630
|
|
basically
|
|
|
|
0:30:51.630,0:30:56.220
|
|
msleep should be dropped, but it has a very nice feature, which
|
|
|
|
0:30:56.220,0:31:02.669
|
|
is the possibility to specify a wakeup priority for the sleeping threads
|
|
|
|
0:31:02.669,0:31:04.740
|
|
once they are asleep
|
|
|
|
0:31:04.740,0:31:07.470
|
|
that condvars still don't have
|
|
|
|
0:31:07.470,0:31:12.430
|
|
maybe if we could port this feature to condition variables, we will be able
|
|
|
|
0:31:12.430,0:31:13.659
|
|
to completely drop msleep
|
|
|
|
0:31:13.659,0:31:18.529
|
|
from the work arena
|
|
|
|
0:31:18.529,0:31:20.450
|
|
this is a
|
|
|
|
0:31:20.450,0:31:25.580
|
|
simple case that is going to show a way,
|
|
|
|
0:31:26.620,0:31:30.670
|
|
a simple way, to deal with, for example, the
|
|
|
|
0:31:30.670,0:31:34.100
|
|
condition I mentioned before: malloc() being willing
|
|
|
|
0:31:34.100,0:31:35.390
|
|
to sleep
|
|
|
|
0:31:35.390,0:31:38.260
|
|
and doing that while holding a lock
|
|
|
|
0:31:38.260,0:31:45.070
|
|
as you see, we have a fake C structure with some members, like flags
|
|
|
|
0:31:45.070,0:31:47.659
|
|
and an object, a struct foo,
|
|
|
|
0:31:47.659,0:31:49.940
|
|
which needs to be allocated
|
|
|
|
0:31:49.940,0:31:54.400
|
|
and that they are protected by an internal lock
|
|
|
|
0:31:54.400,0:31:58.810
|
|
imagine that, for example, the fake C create function
|
|
|
|
0:31:58.810,0:32:02.269
|
|
holds the lock of the object and does some things
|
|
|
|
0:32:02.269,0:32:04.460
|
|
which are not important
|
|
|
|
0:32:04.460,0:32:07.650
|
|
then in the end for example it's going to
|
|
|
|
0:32:07.650,0:32:09.170
|
|
to allocate
|
|
|
|
0:32:09.170,0:32:14.110
|
|
the FC object and that should be protected in
|
|
|
|
0:32:14.110,0:32:16.470
|
|
an atomic part
|
|
|
|
0:32:16.470,0:32:20.030
|
|
something you can do is just to set the flag
|
|
|
|
0:32:20.030,0:32:22.160
|
|
for that
|
|
|
|
0:32:22.160,0:32:22.730
|
|
saying
|
|
|
|
0:32:22.730,0:32:28.460
|
|
the allocation is going to happen; if you access this structure concurrently,
|
|
|
|
0:32:28.460,0:32:29.899
|
|
just skip the allocation
|
|
|
|
0:32:29.899,0:32:31.500
|
|
and that's what we do
|
|
|
|
0:32:31.500,0:32:32.919
|
|
we check for this flag
|
|
|
|
0:32:32.919,0:32:37.969
|
|
and if it's present, it means that another thread
|
|
|
|
0:32:37.969,0:32:40.149
|
|
is already allocating, and we just skip
|
|
|
|
0:32:40.149,0:32:46.360
|
|
otherwise we set it and then we unlock the mutex,
|
|
|
|
0:32:46.360,0:32:49.100
|
|
then we allocate the memory for the
|
|
|
|
0:32:49.100,0:32:50.610
|
|
for the object
|
|
|
|
0:32:50.610,0:32:52.450
|
|
acquire the lock again
|
|
|
|
0:32:52.450,0:32:54.860
|
|
and we simply assign it
|
|
|
|
0:32:54.860,0:33:00.200
|
|
please note that I've used temporary storage for that in order to make
|
|
|
|
0:33:00.200,0:33:01.830
|
|
some assertion on
|
|
|
|
0:33:01.830,0:33:03.280
|
|
like the MS
|
|
|
|
0:33:03.280,0:33:04.180
|
|
about the
|
|
|
|
0:33:04.180,0:33:05.500
|
|
the pointer
|
|
|
|
0:33:05.500,0:33:10.700
|
|
it is just a trick so that you can verify that the structure was not
|
|
|
|
0:33:10.700,0:33:14.330
|
|
already allocated
|
|
|
|
0:33:14.330,0:33:16.600
|
|
and so that we can get some
|
|
|
|
0:33:16.600,0:33:21.870
|
|
kind of assertion about that
|
|
|
|
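The pattern described in the cues above (set a flag under the lock, drop the lock, do the allocation that may sleep, take the lock again, assert and store) can be sketched as follows. This is a minimal userland sketch under stated assumptions, not the code from the slide: `FakeC`, `ALLOCATING`, `sleeping_alloc` and `fake_c_ensure_obj` are hypothetical names, Python's `threading.Lock` stands in for a kernel mutex, and a short `time.sleep` stands in for `malloc()` with `M_WAITOK`.

```python
import threading
import time

ALLOCATING = 0x1  # hypothetical flag: "an allocation is already in progress"


class FakeC:
    """Hypothetical structure: a lock protecting some flags and an object."""

    def __init__(self):
        self.lock = threading.Lock()
        self.flags = 0
        self.obj = None


def sleeping_alloc():
    # Stand-in for an allocation that may sleep (malloc() with M_WAITOK).
    time.sleep(0.01)
    return bytearray(64)


def fake_c_ensure_obj(fc):
    """Allocate fc.obj without sleeping while fc.lock is held.

    Returns True if this caller performed the allocation, False if it skipped.
    """
    with fc.lock:
        if fc.obj is not None or fc.flags & ALLOCATING:
            return False           # someone else did it, or is doing it: skip
        fc.flags |= ALLOCATING     # announce the allocation before unlocking
    # The lock is dropped here, so sleeping in the allocator breaks no rule.
    tmp = sleeping_alloc()         # temporary storage, as in the talk
    with fc.lock:
        # The flag should have kept everyone else out; assert on the pointer.
        assert fc.obj is None
        fc.obj = tmp
        fc.flags &= ~ALLOCATING
    return True
```

The temporary variable is what makes the final check possible: the real pointer is only written after the lock has been taken again and the assertion has confirmed that nobody else filled it in.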
0:33:22.640,0:33:26.340
|
|
one of the biggest innovations that was brought to FreeBSD
|
|
|
|
0:33:26.340,0:33:30.120
|
|
regarding the locking primitives
|
|
|
|
0:33:30.120,0:33:33.770
|
|
are the interrupt threads
|
|
|
|
0:33:34.640,0:33:36.850
|
|
mainly
|
|
|
|
0:33:36.850,0:33:40.820
|
|
this is pretty simple to explain maybe
|
|
|
|
0:33:40.820,0:33:44.070
|
|
As the top half remains basically the same
|
|
|
|
0:33:44.070,0:33:49.790
|
|
and was going to handle the ISR for the interrupt line for example
|
|
|
|
0:33:49.790,0:33:54.330
|
|
the bottom half changed: instead of running the interrupt
|
|
|
|
0:33:54.330,0:33:58.700
|
|
handlers installed on that line, as traditionally happened,
|
|
|
|
0:33:58.700,0:34:02.140
|
|
it was just going to schedule a thread
|
|
|
|
0:34:02.140,0:34:04.980
|
|
that was going to run the
|
|
|
|
0:34:04.980,0:34:06.940
|
|
the interrupt handler in a
|
|
|
|
0:34:06.940,0:34:12.389
|
|
--- context and not the kind of --it was going to happen
|
|
|
|
0:34:12.389,0:34:15.509
|
|
traditionally in a lot of Unix systems
|
|
|
|
0:34:16.699,0:34:23.179
|
|
this has the big advantage that in using your own context you can
|
|
|
|
0:34:23.179,0:34:24.429
|
|
basically
|
|
|
|
0:34:24.990,0:34:29.889
|
|
you're not forced to use spin locks and you can do a lot of other fancy things
|
|
|
|
0:34:29.889,0:34:32.209
|
|
this necessity came about because
|
|
|
|
0:34:32.209,0:34:33.149
|
|
often
|
|
|
|
0:34:33.149,0:34:38.529
|
|
interrupt handlers need to access some
|
|
|
|
0:34:38.529,0:34:42.589
|
|
need to acquire some subsystem locks, and
|
|
|
|
0:34:42.589,0:34:45.799
|
|
as we were going to use blocking ---around
|
|
|
|
0:34:45.799,0:34:50.379
|
|
we had the necessity to support the
|
|
|
|
0:34:50.379,0:34:52.589
|
|
the locking of the
|
|
|
|
0:34:52.589,0:34:57.119
|
|
the possibilities of wide mutex actually
|
|
|
|
0:34:57.559,0:35:01.759
|
|
A similar thing was implemented using taskqueues
|
|
|
|
0:35:01.759,0:35:02.879
|
|
previously
|
|
|
|
0:35:02.879,0:35:04.010
|
|
and the sometimes it
|
|
|
|
0:35:04.010,0:35:05.740
|
|
I think I saw Linux too
|
|
|
|
0:35:05.740,0:35:08.439
|
|
using taskqueues maybe
|
|
|
|
0:35:08.439,0:35:10.029
|
|
but the
|
|
|
|
0:35:10.029,0:35:14.709
|
|
it was basically something similar but not exactly in this way
|
|
|
|
0:35:14.709,0:35:16.809
|
|
actually, in FreeBSD,
|
|
|
|
0:35:16.809,0:35:20.559
|
|
from release 7,
|
|
|
|
0:35:20.559,0:35:22.579
|
|
the interrupt threads
|
|
|
|
0:35:22.579,0:35:24.659
|
|
this model has changed a little bit
|
|
|
|
0:35:24.659,0:35:26.499
|
|
in order to include the
|
|
|
|
0:35:26.499,0:35:29.739
|
|
new mechanism called filtering
|
|
|
|
0:35:29.739,0:35:36.249
|
|
we have interrupt filters that, basically, instead of
|
|
|
|
0:35:36.249,0:35:39.809
|
|
directly
|
|
|
|
0:35:39.809,0:35:40.879
|
|
scheduling the thread
|
|
|
|
0:35:40.879,0:35:43.209
|
|
linked to the particular line,
|
|
|
|
0:35:43.209,0:35:46.619
|
|
they just check for
|
|
|
|
0:35:46.619,0:35:50.939
|
|
they just let run some new thing in the kernel or context
|
|
|
|
0:35:50.939,0:35:52.449
|
|
that will decide whether to
|
|
|
|
0:35:52.449,0:35:56.709
|
|
handle the request directly or just schedule the thread
|
|
|
|
0:35:56.709,0:35:59.739
|
|
it's like if you have the old bottom handler
|
|
|
|
0:35:59.739,0:36:04.529
|
|
that adds the possibility to register a handler
|
|
|
|
0:36:04.529,0:36:08.869
|
|
still running in interrupt context and at the same time
|
|
|
|
0:36:08.869,0:36:12.009
|
|
decide whether to schedule the thread or not
|
|
|
|
0:36:12.009,0:36:14.499
|
|
so that it's no
|
|
|
|
0:36:14.499,0:36:18.579
|
|
no longer mandatory
|
|
|
|
0:36:18.579,0:36:22.919
|
|
So I think that the first part is finished, so if you have some questions we can
|
|
|
|
0:36:22.919,0:36:23.430
|
|
handle them
|
|
|
|
0:36:23.430,0:36:28.699
|
|
right now
|
|
|
|
0:36:28.699,0:36:35.699
|
|
this should be material for the second part actually
|
|
|
|
0:36:45.279,0:36:48.529
|
|
newbus, for example
|
|
|
|
0:36:48.529,0:36:51.259
|
|
some
|
|
|
|
0:36:51.259,0:36:55.769
|
|
some drivers that are kind of infrequently used; I'm not sure which ones, but all
|
|
|
|
0:36:55.769,0:37:00.049
|
|
the big ones are converted to finer locking
|
|
|
|
0:37:00.049,0:37:03.109
|
|
um
|
|
|
|
0:37:03.109,0:37:07.479
|
|
actually the problem is not which parts are under Giant
|
|
|
|
0:37:07.479,0:37:08.530
|
|
but how we could
|
|
|
|
0:37:08.530,0:37:12.380
|
|
optimize the locking of some subsystems because
|
|
|
|
0:37:12.380,0:37:15.079
|
|
for example, we have the virtual memory,
|
|
|
|
0:37:15.079,0:37:17.910
|
|
which is not under Giant, but it's
|
|
|
|
0:37:17.910,0:37:19.719
|
|
not locked
|
|
|
|
0:37:19.719,0:37:24.400
|
|
optimally and it's going to bring a lot of contention
|
|
|
|
0:37:24.400,0:37:26.230
|
|
so
|
|
|
|
0:37:26.230,0:37:30.329
|
|
it's not under Giant but it should be optimized
|
|
|
|
0:37:30.329,0:37:37.329
|
|
because the parts under Giant are very small: newbus, for example,
|
|
|
|
0:37:37.599,0:37:44.599
|
|
some parts related to VFS, in the mounting,
|
|
but still very small parts
|
|
|
|
0:37:44.979,0:37:51.979
|
|
I'm not sure about others
|
|
|
|
0:37:57.479,0:37:59.170
|
|
sorry
|
|
|
|
0:38:02.069,0:38:08.549
|
|
well usually it should be moved completely but
|
|
|
|
0:38:08.549,0:38:11.019
|
|
yes
|
|
|
|
0:38:11.019,0:38:12.539
|
|
it could
|
|
|
|
0:38:32.909,0:38:34.809
|
|
okay although
|
|
|
|
0:38:34.809,0:38:38.289
|
|
in the kernel we have a basically
|
|
|
|
0:38:38.289,0:38:39.450
|
|
um
|
|
|
|
0:38:39.450,0:38:43.019
|
|
as you should know, we already imported DTrace, for example
|
|
|
|
0:38:43.019,0:38:47.839
|
|
and I have wondered, I have submitted by developed
|
|
|
|
0:38:47.839,0:38:48.669
|
|
my country
|
|
|
|
0:38:48.669,0:38:51.479
|
|
called ---some patches that brings the
|
|
|
|
0:38:51.479,0:38:54.689
|
|
the ----- directly in our locking
|
|
|
|
0:38:54.689,0:38:55.699
|
|
in order to
|
|
|
|
0:38:55.699,0:38:58.890
|
|
allow it to be traced with DTrace,
|
|
|
|
0:38:58.890,0:39:02.009
|
|
which is very nice but it's still not completed
|
|
|
|
0:39:02.009,0:39:03.310
|
|
we are reviewing
|
|
|
|
0:39:03.310,0:39:08.309
|
|
besides that, we have another very useful tool called lock profiling
|
|
|
|
0:39:08.309,0:39:12.039
|
|
that has been very helpful in the past in order to
|
|
|
|
0:39:12.039,0:39:14.110
|
|
find the most contended locks
|
|
|
|
0:39:14.110,0:39:17.469
|
|
and to try to convert them to finer locking
|
|
|
|
0:39:17.469,0:39:20.589
|
|
so at least for the kernel we have such mechanisms
|
|
|
|
0:39:20.589,0:39:22.719
|
|
I'm not sure what should
|
|
|
|
0:39:22.719,0:39:26.640
|
|
have been for user space. I'm sure we don't have something similar,
|
|
|
|
0:39:26.640,0:39:28.310
|
|
but maybe other systems
|
|
|
|
0:39:28.310,0:39:29.469
|
|
have
|
|
|
|
0:39:29.469,0:39:30.749
|
|
similar tools
|
|
|
|
0:39:30.749,0:39:36.039
|
|
I don't know I just know FreeBSD so
|
|
|
|
0:39:58.479,0:39:59.220
|
|
not sure
|
|
|
|
0:39:59.220,0:39:59.919
|
|
would you repeat
|
|
|
|
0:39:59.919,0:40:03.879
|
|
some voice please. No I can't hear
|
|
|
|
0:40:03.879,0:40:05.509
|
|
It seems to me that
|
|
|
|
0:40:05.509,0:40:08.269
|
|
you don't have to do all the work that you do with locking
|
|
|
|
0:40:08.269,0:40:11.469
|
|
if you're not on SMP, right?
|
|
|
|
0:40:11.469,0:40:13.029
|
|
well no
|
|
|
|
0:40:13.029,0:40:15.259
|
|
it's not right because the
|
|
|
|
0:40:15.259,0:40:20.210
|
|
you have to protect even against some mechanism like preemption
|
|
|
|
0:40:20.210,0:40:25.989
|
|
which is going to be tricky. It is implemented differently than in FreeBSD 4.x, so
|
|
|
|
0:40:25.989,0:40:28.909
|
|
with preemption it's like
|
|
|
|
0:40:28.909,0:40:30.099
|
|
from
|
|
|
|
0:40:30.099,0:40:34.479
|
|
it's as if you had a real SMP system, from a technical point of view,
|
|
|
|
0:40:34.479,0:40:35.809
|
|
so you have to handle
|
|
|
|
0:40:35.809,0:40:38.339
|
|
problems typical of that
|
|
|
|
0:40:38.339,0:40:43.249
|
|
really, in the kernel we have other kinds of synchronization, like atomics
|
|
|
|
0:40:43.249,0:40:45.500
|
|
I don't, I should have had
|
|
|
|
0:40:45.500,0:40:50.609
|
|
a slide about that but it disappeared so I can tell you by voice
|
|
|
|
0:40:50.609,0:40:55.170
|
|
well, we have the possibility to use atomic instructions in the
|
|
|
|
0:40:55.170,0:40:57.369
|
|
FreeBSD kernel directly
|
|
|
|
0:40:57.369,0:40:59.249
|
|
but the
|
|
|
|
0:40:59.249,0:41:03.119
|
|
to even use memory barriers linked with them
|
|
|
|
0:41:03.119,0:41:08.869
|
|
the only pitfall is that you cannot really rely on
|
|
|
|
0:41:08.869,0:41:10.469
|
|
cache coherency
|
|
|
|
0:41:10.469,0:41:14.339
|
|
because, as it's MD-specific, you can just
|
|
|
|
0:41:14.339,0:41:16.989
|
|
you can only be sure about
|
|
|
|
0:41:16.989,0:41:21.879
|
|
what happens on the CPU where you use the atomic and where you use the memory barrier
|
|
|
|
0:41:21.879,0:41:26.349
|
|
you cannot make assumptions about whether other CPUs
|
|
|
|
0:41:26.349,0:41:29.289
|
|
can see your modifications or not
|
|
|
|
0:41:29.289,0:41:31.640
|
|
and if the cache can handle that
|
|
|
|
0:41:31.640,0:41:37.119
|
|
we have specific primitives in order to, for example, disable preemption,
|
|
|
|
0:41:37.119,0:41:39.379
|
|
which are the critical sections
|
|
|
|
0:41:39.379,0:41:42.179
|
|
critical_enter and critical_exit
|
|
|
|
0:41:42.179,0:41:45.309
|
|
when you call them,
|
|
|
|
0:41:45.309,0:41:48.219
|
|
preemption is simply disallowed
|
|
|
|
0:41:48.219,0:41:54.749
|
|
that's a very fast primitive,
|
|
|
|
0:41:54.749,0:41:56.049
|
|
so there's not much overhead
|
|
|
|
0:41:56.049,0:42:00.679
|
|
we also have a way to disable interrupts, which is unofficial. I will say
|
|
|
|
0:42:00.679,0:42:03.079
|
|
that
|
|
|
|
0:42:03.079,0:42:07.720
|
|
because you can do that in a machine-dependent way
|
|
|
|
0:42:07.720,0:42:10.619
|
|
with spinlock_enter and spinlock_exit
|
|
|
|
0:42:10.619,0:42:14.989
|
|
and then
|
|
|
|
0:42:14.989,0:42:16.049
|
|
yeah that you can
|
|
|
|
0:42:16.049,0:42:17.389
|
|
even disable
|
|
|
|
0:42:17.389,0:42:19.479
|
|
some thread migration
|
|
|
|
0:42:19.479,0:42:22.940
|
|
using the sched_pin primitives
|
|
|
|
0:42:22.940,0:42:25.319
|
|
that are very useful
|
|
|
|
0:42:25.319,0:42:29.779
|
|
when you are going to access, for example, per-CPU data
|
|
|
|
0:42:29.779,0:42:33.270
|
|
and you have several accesses and you don't want the CPU to migrate
|
|
|
|
0:42:33.270,0:42:34.200
|
|
from that
|
|
|
|
0:42:34.200,0:42:36.619
|
|
thread migrate from that CPU
|
|
|
|
0:42:36.619,0:42:38.729
|
|
because you could read different
|
|
|
|
0:42:38.729,0:42:45.369
|
|
values from different CPUs then
|
|
|
|
0:42:45.369,0:42:46.479
|
|
I'm not sure
|
|
|
|
0:42:46.479,0:42:52.079
|
|
if there is something else okay
|
|
|
|
0:42:52.079,0:42:57.229
|
|
questions? no?
|
|
|
|
0:42:57.229,0:42:58.189
|
|
so I'll see you later"
|