Information for members of the EIO organizing team

The main activities of the EIO organizing committee are:

  1. Creating tasks for various contests, and organizing these contests.
  2. Organizing workshops ("õppesessioonid") for students to learn about contest programming.

Tasks and contests

There are 4 main contests during the school year:

In total, there are 24 tasks per season.

Task creation process

Our process is as follows:

Creating a good task

For beginners, it is fine to use any creative programming task (example: count valid moves for a chess piece).

For advanced tasks, anything covered by the IOI syllabus is fair game. Usually there is a brute-force way of solving the problem and a cleverer algorithmic solution. The best tasks have multiple tiers of possible solutions (e.g. O(2^N), O(N^3) and O(N log N) complexity).

"Implement a well-known standard algorithm" is not a good task. A good task should either have an interesting and somewhat original solution idea or a "natural" statement.

Preparing the task

So, you had a good task idea and the jury decided to use it in a contest. Now it's time to prepare it.

The folder of each task in the contest repository has the following structure:

Creating tests

Make your tests as thorough as possible, covering various special cases, minimum/maximum inputs, etc. If your task has multiple possible solutions with different complexities, make sure the scoring reflects that.

For example, if there is a brute-force solution and a clever solution, a good rule of thumb is to give 50% of the points for the brute-force (but correct) solution and 50% for the fast solution. If there are slow, medium and fast solutions, you can do 40%-30%-30%, and so on.

GroupMin and GroupSum

Test cases are generally divided into subtasks. There are two basic types of scoring systems:

In some tasks you might want to give partial credit for a test case. For example, if the task is "find the shortest path in a graph", you might want to give half of the points if the length of the path is correct and the other half if the path itself is also correctly printed in the output. In such cases, similar logic applies to GroupSum and GroupMin. Suppose a contestant correctly calculates the length of the path in every test case of a subtask, but outputs the wrong path in one test case. With GroupSum, they will get full credit for every test case except the one with the mistake, and half credit for that one. With GroupMin, they will only get half of the points for the entire subtask. Whether the task uses GroupSum or GroupMin is defined in task.yaml.
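
To make the difference concrete, here is a small standalone sketch (not part of CMS, just an illustration of the aggregation logic described above) that computes one subtask's score under both systems, using the shortest-path example:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

// Per-test scores are fractions in [0, 1]; the subtask is worth maxPoints.
double groupSum(const std::vector<double>& fractions, double maxPoints) {
    // Every test contributes an equal share of the subtask's points.
    double total = std::accumulate(fractions.begin(), fractions.end(), 0.0);
    return maxPoints * total / fractions.size();
}

double groupMin(const std::vector<double>& fractions, double maxPoints) {
    // The whole subtask is scored by its weakest test.
    return maxPoints * *std::min_element(fractions.begin(), fractions.end());
}

int main() {
    // Five tests worth 20 points in total; the path length is correct
    // everywhere, but the path itself is wrong in one test (half credit).
    std::vector<double> scores = {1.0, 1.0, 1.0, 0.5, 1.0};
    std::cout << groupSum(scores, 20) << "\n"; // 18
    std::cout << groupMin(scores, 20) << "\n"; // 10
}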

In easier tasks (tasks 1, 2 and possibly 3 of a typical contest) we usually use GroupSum. These tasks are aimed at beginners and we try to be friendly. However, there are some cases in which GroupSum can't be used. For example, if the task has a yes/no answer (or the output format is "yes, here's a solution: ..."/"no, there is no solution"), then a contestant can potentially get a large number of points by submitting a program that simply outputs "no" in every case. GroupSumCond is designed precisely for this case: it divides groups into unconditional and conditional ones. The scoring for unconditional groups works the same as in GroupSum, but the scores for conditional groups are only awarded to programs that score at least some points for the unconditional groups.

In harder tasks, however, GroupMin is strongly recommended. Why?

If you use GroupMin, also document that in the problem statement's scoring section. There is a standard text we use ("Selles ülesandes on testid jagatud gruppidesse...", i.e. "In this task, the tests are divided into groups...").

The GEN file

The grouping of tests to subtasks is defined in the gen/GEN file. In its simplest form, it looks like this:

# ST: 0
0
1
# ST: 2
2
# ST: 4
3
4
# ST: 20
5
6
7
8
9
# ST: 4
10
11

There are 12 test cases in this problem, divided into five subtasks, worth 0, 2, 4, 20 and 4 points, respectively. For example, the subtask worth 20 points consists of tests 5...9. Tests 0 and 1 are the examples shown in the task statement; since these tests do not give points (but we still want to judge them on the server), they are in a separate subtask, worth 0 points.

However, this is not very good, because it doesn't tell us where the tests came from. In most cases you probably aren't typing the test cases up by hand, especially since it is common to have test cases with some 100 000 lines of data. It's a good idea to put the test generation scripts in the repository as well: if there are issues with the test cases and you aren't available, others can figure out where the problems come from and fix them. GEN is usually the perfect place to document which generation scripts and which parameters correspond to which test cases, because its syntax allows using it as a shell script. Here is an example from a recent contest.

# ST: 0
# Examples
touch ../input/input0.txt
touch ../input/input1.txt
touch ../input/input2.txt
# ST: 20
# 1-bigk
touch ../input/input3.txt
../bin/gen_simple -n=10 -m=15 -wnx=3 -K=10 > ../input/input4.txt
../bin/gen_simple -n=10 -m=20 -wnx=3 -K=10 > ../input/input5.txt
(some lines omitted)
../bin/gen_simple -n=1000 -m=300000 -wnx=1 -K=1000 > ../input/input10.txt
../bin/gen_dense -n=1000 -m=300000 > ../input/input11.txt
../bin/gen_simple -n=100000 -m=300000 -wnx=200 -K=100000 > ../input/input12.txt
../bin/gen_simple -n=300000 -m=300000 -wnx=200 -K=300000 > ../input/input13.txt
# ST: 20
# 2-20
../bin/gen_simple -n=7 -m=8 -wnx=3 -K=1 > ../input/input14.txt
(some lines omitted)
../bin/gen_dense -n=20 -m=150 > ../input/input20.txt
# ST: 20
# 3-1000
../bin/gen_simple -n=900 -m=1000 -wnx=10 -K=3 > ../input/input21.txt
../bin/gen_simple -n=990 -m=1000 -wnx=10 -K=10 > ../input/input22.txt
../bin/gen_simple -n=999 -m=1000 -wnx=10 -K=20 > ../input/input23.txt
../bin/gen_dense -n=45 -m=980 > ../input/input24.txt
# ST: 40
# 4-full
../bin/gen_simple -n=100000 -m=300000 -wnx=100 -K=1 > ../input/input25.txt
../bin/gen_simple -n=100000 -m=300000 -wnx=100 -K=10 > ../input/input26.txt
(some lines omitted)
../bin/gen_simple -n=299900 -m=300000 -wnx=100 -K=1000 > ../input/input36.txt
../bin/gen_dense -n=1000 -m=300000 again > ../input/input37.txt
../bin/comet -n=300000 -head=100 -K=10 > ../input/input38.txt
../bin/comet -n=300000 -head=100 -K=100 > ../input/input39.txt

Every line corresponds to a test case. In most cases we execute a generation script and redirect its output to the corresponding input file. Some test cases are handwritten; those we simply touch. Lines starting with # that are not # ST: headers can be used as regular comments.

In tasks using the GroupSumCond scoring method, the # ST: headings carry an additional flag: # ST: 30 U is an unconditional group worth 30 points, # ST: 20 C is a conditional group worth 20 points (awarded only to programs that get at least some points for the unconditional groups), and # ST: 0 E is used for the examples.
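
For illustration, a GEN file for a GroupSumCond task might look like the following; the generator names and parameters here are made up:

# ST: 0 E
# Examples
touch ../input/input0.txt
touch ../input/input1.txt
# ST: 30 U
# Small cases, brute force suffices
../bin/gen_small -n=100 > ../input/input2.txt
../bin/gen_small -n=1000 > ../input/input3.txt
# ST: 20 C
# Large cases; scored only if some unconditional group gives points
../bin/gen_large -n=100000 -ans=no > ../input/input4.txt
../bin/gen_large -n=100000 -ans=yes > ../input/input5.txt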

Advice for writing strong tests

Implement every wrong heuristic you can think of and make sure all of them fail. If your solution has many special cases, try commenting out each case in turn and check that the solution then fails.

Randomize all numbers that can be randomized. For example, when generating a graph, randomize the labels of the vertices, the order in which edges are given and the order of vertices in each edge.
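
For example, a graph generator in the spirit of this advice might look like the sketch below. It assumes a reasonably recent testlib.h (discussed later on this page) with opt() parsing of -n=... style options; the structure of the graph and the parameters are made up for illustration:

#include "testlib.h"
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

int main(int argc, char* argv[]) {
    registerGen(argc, argv, 1);
    int n = opt<int>("n");
    int m = opt<int>("m"); // assumes m >= n - 1

    // Relabel the vertices randomly so that vertex 1 is not special.
    vector<int> label(n + 1);
    for (int i = 1; i <= n; i++) label[i] = i;
    shuffle(label.begin() + 1, label.end());

    // A random tree plus extra random edges; duplicates and self-loops are
    // not filtered here, a real generator should respect the task's constraints.
    vector<pair<int, int>> edges;
    for (int v = 2; v <= n; v++)
        edges.push_back({rnd.next(1, v - 1), v});
    while ((int)edges.size() < m)
        edges.push_back({rnd.next(1, n), rnd.next(1, n)});

    // Randomize the order of the edges and of the endpoints within each edge.
    shuffle(edges.begin(), edges.end());
    cout << n << " " << m << "\n";
    for (auto [u, v] : edges) {
        if (rnd.next(2)) swap(u, v);
        cout << label[u] << " " << label[v] << "\n";
    }
}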

In some cases you might want to consider putting multiple test cases in the same file, as is common in university competitions. In the task statement it might look something like this: "The first line of the input contains a single integer T (1 <= T <= 1000): the number of test cases. Each test case is described as follows. (...) It is guaranteed that the sum of N over all test cases doesn't exceed 100 000." It can be very hard to fail all possible wrong solutions (even solutions that are total nonsense) using only a dozen or so test cases. Putting multiple test cases in one file allows you to use thousands of them; a wrong solution will almost surely produce a wrong output on at least one. This can't be used in all tasks, however:

Creating custom checkers

A checker is a program that consumes the input file, the jury's output file and the contestant's output file and decides how many points should be given. If the correct output is unique in all test cases, you don't need to write a checker. However, if there are multiple possible correct outputs, we need a custom checker to assess the solution's correctness.

testlib.h

testlib.h is a header-only C++ library for preparing programming contest problems. It is particularly useful for writing checkers, but it can also be used for generators. We have a variant of testlib.h suitable for use in our CMS.
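
As an illustration, a minimal checker skeleton in testlib.h style might look like this. It is loosely based on the shortest-path example from the scoring section above; the path verification is left as a placeholder, and the exact reporting interface of our CMS variant of testlib.h may differ slightly:

#include "testlib.h"

int main(int argc, char* argv[]) {
    registerTestlibCmd(argc, argv);
    // inf = input file, ans = jury's answer, ouf = contestant's output.
    long long expectedLen = ans.readLong();
    long long foundLen = ouf.readLong();
    if (foundLen != expectedLen)
        quitf(_wa, "wrong length: expected %lld, found %lld", expectedLen, foundLen);
    // ... read the contestant's path from ouf and verify it against inf here ...
    bool pathOk = true; // placeholder for the actual path verification
    if (!pathOk)
        quitp(0.5, "length is correct, but the path is not");
    quitf(_ok, "length %lld and path are correct", foundLen);
}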

Why use it?

Sample contest timeline

Workshops

Workshops usually have 6 academic hours of work per day. Ideally, about 40% of the time is spent by the lecturer explaining the material and 60% on practical work. Seek out sample problems for the topics; https://codeforces.com/ and https://uva.onlinejudge.org/ are good resources.

Below is a sample division of topics between the workshops. This is not set in stone; bits and pieces may be shifted around. The "UVA" tag denotes task numbers at https://uva.onlinejudge.org/

Traditionally, we have organized the following workshops:

See also: https://people.ksp.sk/~misof/ioi-syllabus/ioi-syllabus.pdf
