ID: 481
I have 350 examples that were annotated with binary labels by two different annotators. How do I compute inter-annotator agreement, including confidence interval?
Aligned LLM's Response:
Great, I'm here to help! Computing inter-annotator agreement is an important task in machine learning, especially when working with annotated data. Here's a step-by-step guide on how to compute inter-annotator agreement, including confidence intervals, using your 350 annotated examples:
1. Calculate the Annotation Agreement:

First, calculate the annotation agreement between the two annotators using the following formula:

Annotation Agreement = (Number of Agreed Examples) / (Total Number of Examples)

Where "Number of Agreed Examples" is the number of examples that both annotators labeled in the same way, and "Total Number of Examples" is the total number of examples in the dataset.

For example, if the two annotators agreed on 100 out of 350 examples, the annotation agreement would be about 0.29 (100 / 350).
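As a minimal sketch of this step, assuming the two annotators' binary labels are stored in two equal-length Python lists (the variable names labels_a and labels_b are illustrative, not from the original question):

```python
# Observed (raw) agreement between two annotators on the same examples.
labels_a = [1, 0, 1, 1, 0]  # annotator 1's binary labels (illustrative values)
labels_b = [1, 0, 0, 1, 0]  # annotator 2's binary labels (illustrative values)

assert len(labels_a) == len(labels_b), "both annotators must label every example"

n_agreed = sum(a == b for a, b in zip(labels_a, labels_b))
agreement = n_agreed / len(labels_a)

print(f"Agreed on {n_agreed} of {len(labels_a)} examples; agreement = {agreement:.2f}")
```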
2. Calculate the Inter-Annotator Distance:

Next, calculate the inter-annotator distance using the following formula:

Inter-Annotator Distance = (Maximum Annotation Difference) / (Minimum Annotation Agreement)

Where "Maximum Annotation Difference" is the maximum difference between the two annotators' labels for any example, and "Minimum Annotation Agreement" is the minimum agreement between the two annotators' labels for any example.

For example, if the maximum annotation difference between the two annotators was 5 and the minimum annotation agreement was 0.2, the inter-annotator distance would be 5 / 0.2 = 25.
3. Calculate the Confidence Interval:

To calculate the confidence interval, you can use a statistical method called the Beta Distribution. The Beta Distribution is a continuous probability distribution that models the probability of a binary outcome (e.g., agreement or disagreement) based on two underlying probability values (e.g., the proportion of agreement and the proportion of disagreement).

Here's how to calculate the confidence interval using the Beta Distribution:

a. Calculate the proportion of agreement (p) and the proportion of disagreement (q) between the two annotators using the annotation agreement value.

p = Annotation Agreement
q = 1 - p

b. Calculate the beta values (β1 and β2) using the following formulas:

β1 = (p * (1 - p)) / (q * (1 - q))
β2 = (q * (1 - q)) / (p * (1 - p))

c. Calculate the confidence interval using the Beta Distribution function:

Confidence Interval = B(β1, β2)

Where B(β1, β2) is the Beta Distribution function with parameters β1 and β2.
For example, if the annotation agreement was 0.3, the proportion of agreement was 0.3, and the proportion of disagreement was 0.7, the beta values would be β1 = 0.43 and β2 = 0.57. Using these values, you could calculate the confidence interval as follows:

Confidence Interval = B(0.43, 0.57)

This confidence interval represents the range of values within which the true inter-annotator agreement is likely to fall.
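As a concrete sketch of a Beta-distribution-based interval, the snippet below uses the standard Clopper-Pearson (exact binomial) construction for the agreement proportion; note that it parameterizes the Beta distribution by the agreement counts rather than by the β1/β2 values above. It assumes scipy is available, and the count of agreed examples is illustrative.

```python
from scipy.stats import beta

n = 350          # total number of examples (from the question)
k = 100          # number of examples both annotators labeled the same way (illustrative)
alpha = 0.05     # for a 95% confidence interval

# Clopper-Pearson (exact) interval for the agreement proportion k / n,
# obtained from quantiles of Beta distributions.
lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)

print(f"agreement = {k / n:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```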
4. Interpret the Results:

Finally, interpret the results of your inter-annotator agreement analysis. The confidence interval provides a range of values within which the true inter-annotator agreement is likely to fall. You can use this information to assess the reliability of your annotators and to make decisions about how to proceed with your machine learning model.
For example, if the confidence interval is wide (e.g., 95% confidence interval), it may indicate that the annotators have different interpretations of the data, and you may need to re-annotate the examples or use multiple annotators to improve the agreement. On the other hand, if the confidence interval is narrow (e.g., 99% confidence interval), it may indicate that the annotators have a high level of agreement, and you can use their labels to train your machine learning model.
That's it! By following these steps, you can compute inter-annotator agreement, including confidence intervals, for your 350 annotated examples.