# HackerRank - Task Scheduling

11 Feb 2013Recently I was leaning Python by going through the problems at HackerRank.com, it original was the interviewstreet, at which I had submitted some codes, they upgrade the site but didn't provide code backups for downloading, that upset me :(, however, something good there is, it makes me have to re-code the Python version without any existed C++ codes.

I'm thinking about posting analysis for some problems I solved, they could be interesting, tricky, or successfully frustrating me too much that I won't let it go easily :)

Here is the link to the problem **Task Scheduling**.

There is a straight forward thought, that is the tasks with smaller deadlines should be completed asap, it can be proved by contradiction.

[I think the description of the problem is just trying to mislead people(?), there is no switch needed]

Let $t_1$,$t_2$ be two tasks with deadline $d_1$ , $d_2$ and time cost $m_1$ , $m_2$, $t_1$ is finished at time $c_1$, and $t_2$ is finished at time $c_2$, instantly, we have $c_2 = m_1+m_2$.

**Statement**: if $d_1 < d_2$ and $t_1$ finished after $t_2$, the max overshoot might be less than that when $t_1$ finished before $t_2$

Proof:

$ts_1$ is the max overshoot if $t_1$ finished first, $ts_2$ is that otherwise.

\begin{eqnarray}

ts_1 = max(c_1 - d_1 , c_2 - d_2) \\

ts_2 = max(c_3 - d_2 , c_4 - d_1)

\end{eqnarray}

where $c_2 = c_4 = m_1 + m_2$

since $c_3 < c_4$ and $d_2 > d_1$, then $ts_2 = c_4 - d_1$

as the statement, to have $ts_2 < ts_1$, things need to satisfy:

\begin{eqnarray}

c_4 - d_1 < c_1 - d_1 \\

or\\

c_4 - d_1 < c_2 - d_2

\end{eqnarray}

which is

\begin{eqnarray}

m_1 + m_2 < c_1 \\

or\\

d_1 > d_2

\end{eqnarray}

and none of above could be true, therefore, we have the contradiction.

The basic solution is to just sort the list and do the calculation, however, what they ask is not just one final answer, but $n$ answers with dynamically increasing set of tasks, this turns the problem into an interval query'n'update problem, what my solution is:

Sort the whole set of tasks first, and each of them have its rank, given the order of input, put the tasks input the spot according to its rank $i$, then the answer would be:

$max\Big\{ \quad max\{S_p - d_p | p \in [1 , i)\} ,\quad S_i - d_i , \quad max\{ S_q - d_q | q \in (i , n]\} \Big\}$

where $S_i = \sum_{k=1}^{i}m_k$

I used Binary Indexed Tree to deal with $S_i$, the update and query can be done in $O(log_n)$, this is not too complicated, the more difficult thing for me is to update and query the part in the back of $i$, though I know a kind of Segment Tree is capable to do this in $O(log_n)$, which is acceptable, I haven't write trees with those operations before, it took me some time to figure out how does it work, and after I finished a working code with c++ and was trying to transfer it to Python, the running efficiency of Python was like a nightmare(about 50 times slower than c++…), I spent hours to optimize the Python codes to make it finally barely accepted at the HackerRank, (well, kinda meaningless though…you don't have to eat steak with Python)

Here is Python code which is the slower version but easier for reading:

I call it P++, lol.

```
import sys
import math
stdin = sys.stdin
Mid= lambda x : ((x[0] + x[1])>>1)
Lowbit = lambda x : (x&(-x))
used = {}
class Task:
def __init__(self , _d , _m , _index):
self.d = int(_d)
self.m = int(_m)
self.index = int(_index)
def __lt__(self , other):
if (self.d == other.d):
return self.m > other.m
else:
return self.d < other.d
def __repr__(self):
return str((self.d , self.m))
class Node:
def __init__(self):
self.val = 0
self.idx = 0
self.add = 0
self.cover = (0 , 0)
self.lChild = None
self.rChild = None
def __lt__(self , other):
if (self.idx not in used):
return True
if (other.idx not in used):
return False
return self.val < other.val
def __repr__(self):
return str(self.val)
def build(self , cover):
self.cover = cover
self.idx = cover[0]
if (cover[0] == cover[1]):
return
mid = Mid(cover)
self.lChild = Node()
self.rChild = Node()
self.lChild.build( (cover[0] , mid) )
self.rChild.build( (mid + 1 , cover[1]) )
def addDown(self):
self.lChild.addValue(self.add)
self.rChild.addValue(self.add)
self.add = 0
def addValue(self , add):
self.val += add
self.add += add
def update(self , cover, add):
if (cover[0] <= self.cover[0] and cover[1] >= self.cover[1]):
self.addValue(add)
return
if (self.add):
self.addDown()
mid = Mid(self.cover)
if (cover[0] <= mid):
self.lChild.update(cover , add)
if (cover[1] > mid):
self.rChild.update(cover , add)
ret = max( self.lChild , self.rChild )
self.val = ret.val
self.idx = ret.idx
def query(self , cover):
if (cover[0] == self.cover[0] and cover[1] == self.cover[1]):
return self
if (self.add):
self.addDown()
mid = Mid(self.cover)
if (cover[1] <= mid):
return self.lChild.query(cover)
elif (cover[0] > mid):
return self.rChild.query(cover)
else:
vl = self.lChild.query( (cover[0] , mid) )
vr = self.rChild.query( (mid + 1 , cover[1]) )
return max(vl , vr)
class BinIdxTree:
def __init__(self , size):
self.S = [0 for i in range(size+10)]
def add(self , x , c):
while (x <= n):
self.S[x] += c
x += Lowbit(x)
def getsum(self , x):
res = 0
while (x > 0):
res += self.S[x]
x -= Lowbit(x)
return res
line = stdin.readline()
n = int(line)
segTreeRoot = Node()
segTreeRoot.build((1 , n))
bTree = BinIdxTree(n)
taskList = []
rank = [0 for i in range(n+1)]
for i in range(n):
line = stdin.readline()
[d , m] = [int(x) for x in line.split()]
t = Task(d , m , i)
taskList.append(t)
taskList.sort()
for i in range(n):
rank[ taskList[i].index ] = i
ans = 0
for i in range(n):
p = rank[i] + 1
bTree.add(p , taskList[p-1].m)
used[p] = True
segTreeRoot.update((p , p) , - taskList[p-1].d)
segTreeRoot.update((p , n) , taskList[p-1].m)
res = bTree.getsum(p) - taskList[p-1].d
res = 0 if res < 0 else res
if (p + 1 <= n):
resNode = segTreeRoot.query((p + 1 , n))
p = resNode.idx
if (p in used):
s =bTree.getsum(p) - taskList[p-1].d
res = s if s > res else res
ans = res if res > ans else ans
print ans
```