MIPS | Malaysian Informatics And Programming Society

P1: Building Fences

Solution

Go through each column and count the number of grass cells. The answer is the minimum number of grass cells in a column.

Code
  • n, m = [int(x) for x in input().split()]
    grid = [input() for _ in range(n)]
    grass_col_count = [0] * m
    for i, row in enumerate(grid):
        for j, cell in enumerate(row):
            if cell == ".":
                grass_col_count[j] += 1
    print(min(grass_col_count))
    
  • #include <bits/stdc++.h>
    using namespace std;
    
    int main() {
        int n, m;
        cin >> n >> m;
        vector<string> s(n);
        for (auto& i : s) cin >> i;
    
        int minv = n;
        for (int j = 0; j < m; j++) {
            int cnt = 0;
            for (int i = 0; i < n; i++) {
                if (s[i][j] == '.') cnt++;
            }
            minv = min(minv, cnt);
        }
        cout << minv << "\n";
        return 0;
    }
    

P2: Fans

Solution

Let $s = c_1 + \cdots + c_N$ and $m = \max c_i$. If $s - m \ge m - 1$, the answer is $s$. Otherwise, the answer is $2(s - m) + 1$.

Intuition

The only time we cannot use all sticks is when we have too many sticks of the same colour, forcing us to put two of them together. Suppose colour $1$ is the most frequent stick colour. Let $s = c_1 + c_2 + \cdots + c_N$ be the total number of sticks. Then, the “worst” sequence we can make is $(1, ?, 1, \ldots, ?, 1)$ (where the $?$s are not $1$), in which colour $1$ appears one more time than the total number of other sticks, i.e. $c_1 - 1 = s - c_1$.

So, intuitively:

  • If $c_1 - 1 \le s - c_1$, we can use all sticks and the answer is $s$.
  • Otherwise, we reduce colour $1$ to one more than $s - c_1$, so the answer is $(s - c_1 + 1) + (s - c_1) = 2(s - c_1) + 1$.
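The formula can be sanity-checked against a brute force on small inputs; a minimal sketch (the helper names `longest` and `formula` are ours, not part of the model solution):

```python
from functools import lru_cache
from itertools import product

def longest(counts):
    # Longest row using at most counts[i] sticks of colour i,
    # with no two adjacent sticks of the same colour (exhaustive search).
    @lru_cache(maxsize=None)
    def go(remaining, last):
        best = 0
        for i, r in enumerate(remaining):
            if r > 0 and i != last:
                nxt = remaining[:i] + (r - 1,) + remaining[i + 1:]
                best = max(best, 1 + go(nxt, i))
        return best
    return go(tuple(counts), -1)

def formula(counts):
    s, m = sum(counts), max(counts)
    return s if s - m >= m - 1 else 2 * (s - m) + 1

# the two agree on all count vectors with 3 colours and 1..4 sticks each
for c in product(range(1, 5), repeat=3):
    assert longest(c) == formula(c)
```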

Formal proof

Suppose we have $s = c_1 + c_2 + \cdots + c_N$ sticks in total. Since the order of groups does not matter, assume $c_1$ is the maximum $c_i$.

Consider the easier question: when is it (not) possible to use all $s$ sticks? If colour $i$ appears $c_i$ times, then to avoid two adjacent colour-$i$ sticks, we must have at least $c_i - 1$ sticks of other colours between them. Mathematically, we must have $s - c_i \ge c_i - 1$ for all $i$. It suffices for the condition to hold for the maximum $c_i$, i.e. $s - c_1 \ge c_1 - 1$.

If $s - c_1 \ge c_1 - 1$ holds, does that mean we can always use all $s$ sticks? The answer is yes. Put all $c_1$ colour-$1$ sticks in a row. This separates the row into $c_1 + 1$ sections.

  • For each of the other colours $i$, since $c_i \le c_1 < c_1 + 1$, we can put the $c_i$ sticks into different sections, guaranteeing that they are not adjacent.
  • We also need to ensure the colour-$1$ sticks are separated. We need $c_1 - 1$ other sticks to separate the $c_1$ colour-$1$ sticks. Since $s - c_1 \ge c_1 - 1$, we have enough sticks to fill all the gaps.

Back to the original problem, which we can reduce to: “What is the maximum number of sticks we can take such that $s - m \ge m - 1$, where $m = \max c_i$?”

Observation: In an optimal solution, at most one group of sticks is not fully taken.

  • Proof: Suppose there are two groups of sticks not fully taken, with colours $a$ and $b$, and the optimal sequence of stick colours is $(x_1, x_2, \ldots, x_K)$. If $x_1 \ne b$, a longer valid sequence is $(a, b, x_1, \ldots, x_K)$. Otherwise $x_1 = b \ne a$, so a longer valid sequence is $(b, a, x_1, \ldots, x_K)$. In both cases we can make a longer sequence, a contradiction.

Therefore, if we cannot take all sticks ($s - m < m - 1$), we should only remove sticks of one colour, and we must reduce $m$ by removing sticks of the most frequent colour.

In conclusion, if $s - m \ge m - 1$ already holds, the answer is $s$. Otherwise, we reduce $m$ to $s - m + 1$, which gives the answer $(s - m + 1) + (s - m) = 2(s - m) + 1$.

Code
  • def solve():
        _n = int(input())
        c = [int(x) for x in input().split()]
        c_max = max(c)
        c_sum = sum(c)
        if c_sum - c_max >= c_max - 1:
            print(c_sum)
        else:
            print((c_sum - c_max) * 2 + 1)
    
    
    t = int(input())
    for _ in range(t):
        solve()
    
  • #include <bits/stdc++.h>
    using namespace std;
    
    int main() {
        int t;
        cin >> t;
        while (t--) {
            int n;
            cin >> n;
    
            long long maxv = 0, sum = 0;
            for (int i = 0; i < n; i++) {
                long long a; cin >> a;
                maxv = max(maxv, a);
                sum += a;
            }
    
            if (maxv >= sum - maxv + 1) {
                cout << (sum - maxv) * 2 + 1 << "\n";
            } else {
                cout << sum << "\n";
            }
        }
        return 0;
    }
    

P3: Trick or Treat

Solution

Task 1–3 ($N \le 500$)

Try every possible triple of candy packs $i, j, k$ ($1 \le i < j < k \le N$) and check whether the condition is satisfied. Tasks 1–2 are small enough to be brute-forced by hand.

Time complexity: $O(N^3)$.

Task 4 ($N \le 5000$)

From this point onward, since the order of colors within a pack does not matter, we normalize each pack by ensuring $a_i \le b_i$ (swapping the values if necessary). This makes counting and lookup more convenient.

Instead of checking all triples of packs directly, observe that once we choose the first two packs, the colors required for the third pack are completely determined by the condition. Therefore, we iterate over all pairs of packs $i$ and $j$ with $i < j$, compute the colors that the third pack must have, and then add the number of remaining packs matching those colors to our total. Each valid triple is counted three times through this process, so we divide the final count by 3.

To quickly find how many packs have a specific color pair, we store the frequency of each normalized pair in a hash map (a dict in Python or std::unordered_map in C++).

Time complexity: $O(N^2)$.
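The pair-enumeration step can be sketched as follows; we assume (as Tasks 5–6 imply) that a valid triple uses exactly two distinct colours, each appearing three times across the three packs:

```python
from collections import Counter

def count_triples_quadratic(packs):
    # O(N^2) sketch of the Task 4 approach: fix two packs, derive the third.
    packs = [tuple(sorted(p)) for p in packs]  # normalize so a <= b
    freq = Counter(packs)
    total = 0
    for i in range(len(packs)):
        for j in range(i + 1, len(packs)):
            need = Counter(packs[i]) + Counter(packs[j])
            # the third pack must bring every colour's count up to three
            third, ok = [], True
            for colour, cnt in need.items():
                if cnt > 3:
                    ok = False
                    break
                third += [colour] * (3 - cnt)
            if not ok or len(third) != 2:
                continue
            third = tuple(sorted(third))
            cnt = freq[third]
            if third == packs[i]:
                cnt -= 1  # don't reuse pack i itself
            if third == packs[j]:
                cnt -= 1  # don't reuse pack j itself
            total += cnt
    # each triple is found once per unordered pair inside it, i.e. 3 times
    return total // 3
```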

Task 5 ($a_i, b_i \le 2$)

When colors are restricted to $1$ and $2$, there are only two possible valid triples (up to ordering of the packs): $(1,1), (1,2), (2,2)$ and $(1,2), (1,2), (1,2)$.

Let $c_{xy}$ be the number of packs with the color pair $(x, y)$. The first type contributes $T_1 = c_{11} c_{12} c_{22}$ triples, while the second type contributes $T_2 = \binom{c_{12}}{3} = \frac{c_{12}(c_{12} - 1)(c_{12} - 2)}{6}$.

The final answer is $T_1 + T_2$.

Time complexity: $O(N)$.

Task 6 ($N \le 200\,000$)

Generalizing Task 5, we apply the same counting strategy to every distinct color pair $(x, y)$ with $x \ne y$. For each such pair, compute the number of valid triples as in Task 5 and sum them all together. Again, use a hash map to maintain the frequencies efficiently.

Time complexity: $O(N)$.

Code
  • import math
    from collections import Counter
    
    n = int(input())
    packs = [tuple(int(x) for x in input().split()) for _ in range(n)]
    # packs = [(1, 2), (2, 3), (4, 1), ...]
    packs = [(a, b) if a <= b else (b, a) for a, b in packs]
    counter = Counter(packs)
    ans = 0
    for (a, b), count in counter.items():
        if a == b:
            continue
        ans += counter[(a, a)] * count * counter[(b, b)]
        ans += math.comb(count, 3)
    print(ans)
    
  • #include <bits/stdc++.h>
    using namespace std;
    
    int main() {
        ios::sync_with_stdio(false);
        cin.tie(0);
    
        int n = 0;
        cin >> n;
    
        unordered_map<int, long long> aa;
        map<pair<int, int>, long long> ab;
    
        for (int i = 0; i < n; i++) {
            int a, b;
            cin >> a >> b;
            if (a > b) swap(a, b);
            if (a == b) {
                if (aa.find(a) == aa.end()) aa[a] = 0;
                aa[a]++;
            }
            else {
                pair<int, int> p = make_pair(a, b);
                if (ab.find(p) == ab.end()) ab[p] = 0;
                ab[p]++;
            }
        }
        long long ans = 0;
        for (auto [p, cnt] : ab) {
            if (cnt >= 3) {
                ans += (long long)(cnt * (cnt - 1) * (cnt - 2)) / 6;
            }
    
            if (aa.find(p.first) != aa.end() && aa.find(p.second) != aa.end()) {
                ans += cnt * aa[p.first] * aa[p.second];
            }
        }
        cout << ans << "\n";
        return 0;
    }
    

P4: Word Distance

Solution

Task 1–3 ($M = 4$)

The common additional constraint across these tasks is that $M = 4$. Moreover, there are only 26 different lowercase characters (‘a’ to ‘z’). Hence, we can enumerate all possible words of length 4, which totals $26^4 \approx 4 \times 10^5$.

While enumerating each candidate word, we need to compute its distance to each given word. Observe that, by the definition of anagram, the order of characters in a string does not matter. Therefore, for each string, we can count the occurrences of each letter (‘a’ to ‘z’) and compute the sum of absolute differences between the occurrence counts of the two strings. This quantity is twice the actual distance between them.

To select the lexicographically smallest optimal string, we may store all candidate strings that achieve the minimum total distance, and then sort them using the standard library of your programming language of choice. The time complexity for computing the distances is $O(nm)$, coming mainly from counting letter occurrences.

Key observations from these tasks:

  1. The order of characters in a string does not matter.
  2. The distance between two strings is the sum of absolute differences between their letter occurrence counts, divided by 2.
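Observation 2 translates directly into code; a minimal sketch (the function name `distance` is ours):

```python
from collections import Counter

def distance(s, t):
    # Half the sum of absolute differences of letter counts (observation 2).
    cs, ct = Counter(s), Counter(t)
    return sum(abs(cs[ch] - ct[ch]) for ch in cs.keys() | ct.keys()) // 2

# "abc" vs "abd": the counts differ only in 'c' and 'd'
assert distance("abc", "abd") == 1
```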

Task 4 (Given strings are anagrams)

Since the given strings are anagrams of each other, the answer is simply the lexicographically smallest string that is an anagram of any given string. To see this, note again that the order of characters does not matter. Therefore, letters that appear earlier in the alphabet should be placed earlier in the final string. As a result, sorting any one of the given strings yields the correct answer.

Key observation from the tasks:

  1. Sort the string before printing the answer (for a fixed multiset of letter occurrences, this guarantees the lexicographically smallest result).

Task 5–7 (Strings consist of only ‘a’ and ‘b’)

With the observation from Task 4, the final answer must consist of some number of ‘a’ characters followed by some number of ‘b’ characters. Formally, if the final string contains $i$ occurrences of ‘a’, then it must contain $m - i$ occurrences of ‘b’, since its total length must be $m$. This leads to the key idea: enumerate all possible values of $i$ (i.e. the number of ‘a’ characters in the final string).

For each choice of $i$, construct the corresponding candidate string ($i$ ‘a’s followed by $m - i$ ‘b’s) and compute its total distance to the given strings using the same method applied in Tasks 1–4. Select the candidate with the minimum total distance. In case of ties, prioritize the larger value of $i$.
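A sketch of this enumeration, assuming every given string has length $m$ (the helper name `solve_two_letters` is ours):

```python
def solve_two_letters(m, strings):
    # Try every count i of 'a'; the candidate is i 'a's then (m - i) 'b's.
    def dist(s, t):
        # half the L1 distance between the two letter-count vectors
        return (abs(s.count("a") - t.count("a"))
                + abs(s.count("b") - t.count("b"))) // 2

    best = None
    for i in range(m + 1):
        cand = "a" * i + "b" * (m - i)
        total = sum(dist(cand, s) for s in strings)
        # '<=' so that, on ties, the larger i (more 'a's) wins
        if best is None or total <= best[0]:
            best = (total, cand)
    return best[1]
```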

Task 8–10 (No additional constraints)

Since there are 26 letters, the method from Tasks 5–7 does not work. There are two different greedy solutions for these tasks. We present both solutions together with the intuition for their correctness, followed by a rigorous proof.

Solution 1 - Adding the character that minimizes the added cost

Algorithm 1

  1. Start with an empty string. While the length of the string is less than $m$, do the following:
  2. Greedily add the letter $c$ that causes the smallest increase in total distance (breaking ties by choosing the letter that appears earlier in the alphabet). To compute the increase in total distance when adding letter $c$:
    • Suppose that, among the $n$ given strings, there are $a$ strings whose occurrence count of letter $c$ is less than or equal to the occurrence count of $c$ in the partial final string.
    • These $a$ strings each increase the distance by $1$.
    • The remaining $n - a$ strings each decrease the distance by $1$.
    • Therefore, the total change in distance is $a - (n - a) = 2a - n$.
  3. Once the string reaches length $m$, sort it to obtain the final answer.

Intuition for Correctness

  • As the number of occurrences of a letter in the final string increases, the value of $a$ (in $2a - n$) either stays the same or increases. Thus, the marginal cost of adding another copy of the same letter never decreases.
  • Therefore, at each step, picking the letter that minimizes the increase in distance is optimal.
  • Choosing a letter with a larger immediate cost cannot give a future benefit, because its marginal cost will never drop later.
  • Hence, the greedy strategy is correct.
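A plain $O(26mn)$ transcription of Algorithm 1 (without the priority-queue bookkeeping used in the C++ code further down; the function name is ours):

```python
from string import ascii_lowercase

def greedy_min_distance(m, strings):
    # Algorithm 1: repeatedly append the letter with the smallest marginal
    # cost 2a - n, breaking ties alphabetically, then sort the result.
    n = len(strings)
    counts = [[s.count(ch) for ch in ascii_lowercase] for s in strings]
    chosen = [0] * 26  # letter counts of the partial final string
    picked = []
    for _ in range(m):
        best = None
        for c in range(26):
            # a = number of strings whose count of letter c is already
            # <= the count of c in the partial string
            a = sum(1 for i in range(n) if counts[i][c] <= chosen[c])
            cost = 2 * a - n
            if best is None or cost < best[0]:
                best = (cost, c)  # strict '<' keeps the earliest letter on ties
        chosen[best[1]] += 1
        picked.append(best[1])
    return "".join(sorted(chr(ord("a") + c) for c in picked))
```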

Solution 2 - Removing elements from multisets

Algorithm 2

  1. Start with an empty string. While the length of the string is not mm, do the following:
  2. Choose the character that appears in the largest number of the given strings.
    • Formally, treat each string as a multiset of characters (each letter with its corresponding occurrence count).
    • Identify the character that appears in the largest number of the $n$ multisets.
    • Append that character to the final string.
    • Then, for each string, remove one occurrence of that character (if its occurrence is nonzero).
  3. Once the length of the string becomes mm, sort the string to obtain the final answer.

Intuition for Correctness

  • At each step, by choosing the character that appears in the most multisets, you are effectively selecting the character that decreases the total distance the most.
  • After you remove one occurrence of that character from the strings, the number of multisets containing that character can only stay the same or decrease, meaning the potential decrease in distance from choosing that character will not increase later.
  • Thus, similar to Solution 1, selecting the letter that gives the largest immediate improvement is optimal, since no delayed choice can yield a larger benefit in the future.
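Algorithm 2 admits an equally short sketch (again $O(26mn)$, function name ours):

```python
from collections import Counter
from string import ascii_lowercase

def greedy_multiset(m, strings):
    # Algorithm 2: pick the letter present in the most multisets, remove one
    # occurrence of it from every multiset that has it, repeat m times.
    multisets = [Counter(s) for s in strings]
    picked = []
    for _ in range(m):
        # max() returns the first maximiser, i.e. the alphabetically smallest
        best = max(ascii_lowercase,
                   key=lambda ch: sum(1 for ms in multisets if ms[ch] > 0))
        picked.append(best)
        for ms in multisets:
            if ms[best] > 0:
                ms[best] -= 1
    return "".join(sorted(picked))
```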

Proof of Correctness

The proof of correctness is split into two parts:

  1. Prove that the greedy algorithm produces a string with the minimum total distance.
  2. Prove that among all strings with minimum total distance, the greedy algorithm produces the lexicographically smallest one.

Proof of Statement 1

We prove the correctness of the greedy algorithm using an exchange argument. Assume that some optimal solution (denoted OPT) achieves the minimum possible total distance. As long as OPT differs from the greedy solution, we show how to exchange characters so that:

  • the total distance never increases, and
  • OPT becomes strictly “closer” to the greedy solution.

Eventually, OPT is transformed into the greedy solution while maintaining the same total distance, implying the greedy solution is also optimal.

Key Definitions

Fix a letter $c$, and suppose it appears $k$ times in the partial final string.

  • Let $a$ be the number of given strings whose occurrence count of letter $c$ is at most $k$.
  • Let $f(a) = 2a - n$ be the increase in total distance if we add one more occurrence of letter $c$.

From the algorithm, the increase in cost depends only on $a$, and since increasing the number of chosen characters only increases $a$, the function $f$ is monotonically non-decreasing.

The Exchange Step

Let letters $i$ and $j$ be such that:

  • OPT uses fewer occurrences of $i$ than the greedy solution, and
  • OPT uses more occurrences of $j$ than the greedy solution.

Let:

  • $a^{OPT}_i, a^{G}_i$ be the corresponding $a$-values for letter $i$ in OPT and greedy;
  • $a^{OPT}_j, a^{G}_j$ be the corresponding values for letter $j$.

Because OPT has fewer $i$’s than greedy, $a^{OPT}_i \le a^{G}_i - 1$.

Similarly, because OPT has more $j$’s than greedy, $a^{G}_j \le a^{OPT}_j - 1$.

Since $f$ is monotonic, $f(a^{OPT}_i) \le f(a^{G}_i - 1)$ and $f(a^{G}_j) \le f(a^{OPT}_j - 1)$.

The greedy algorithm chose $i$ over $j$, so $f(a^{G}_i - 1) \le f(a^{G}_j)$.

Combining these inequalities gives $f(a^{OPT}_i) \le f(a^{OPT}_j - 1)$.

Now modify OPT by replacing one occurrence of $j$ with one occurrence of $i$, and call the new solution OPT′. The change in cost is $f(a^{OPT}_i) - f(a^{OPT}_j - 1) \le 0$.

Thus, the total distance does not increase.

By repeatedly applying such exchanges, OPT can be transformed into the greedy solution without increasing its cost. Since OPT already had the minimum possible total distance, the greedy solution must also have the minimum total distance.

Reduction of Solution 2 to Solution 1

In Solution 2, if a character $c$ appears at least once in a multiset, it means the occurrence count of $c$ in that string is strictly greater than the count of $c$ in the partial final string. This corresponds exactly to the quantity $n - a$ in Solution 1. Thus, the change in cost is $a - (n - a) = 2a - n$, the same expression used in Solution 1.

Therefore, Solution 2 is simply a reformulation of Solution 1, and the correctness of Solution 1 immediately implies the correctness of Solution 2.

Proof of Statement 2

This part is simpler. Define the cost of a solution as:

$$\text{Cost} = g \times 26 \times n + \sum_{c \in s} (c - \text{‘a’}),$$

where:

  • $g$ is the total distance of the solution, and
  • $s$ is the resulting string.

Note that $\lfloor \text{Cost} / (26n) \rfloor$ equals the total distance, because the sum over the final string of each character’s position in the alphabet is less than $26n$. This is why we multiply the distance by $26n$: it creates a large enough “gap” between distance levels that minimizing the total distance dominates any lexicographic concerns. Minimizing the secondary term $\sum_{c \in s}(c - \text{‘a’})$ is then equivalent to finding the lexicographically smallest string.

Thus, the greedy algorithm, by always choosing the lexicographically smallest option among ties, minimizes the full cost and therefore outputs the lexicographically smallest string among all optimal (minimum-distance) strings.

Code (C++, Solution 1)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;
    vector< vector<int> > cnt(n, vector<int>(26));
    for (int i = 0; i < n; i++) {
        string s;
        cin >> s;
        for (char c : s) {
            cnt[i][c - 'a']++;
        }
    }

    string ans = "";

    priority_queue< tuple<int, int, int, int> > pq;
    for (int c = 0; c < 26; c++) {
        int total_occ = 0;
        int morethan = 0;
        for (int i = 0; i < n; i++) {
            if (cnt[i][c] > total_occ) {
                morethan++;
            }
        }

        pq.emplace(morethan - (n - morethan), -1 * c, morethan, total_occ);
    }

    while (ans.size() < m) {
        auto [_, c, morethan, total_occ] = pq.top(); pq.pop();

        ans += (char)('a' + (-1) * c);

        total_occ++;
        for (int i = 0; i < n; i++) {
            if (cnt[i][c * -1] == total_occ) morethan--;
        }
        pq.emplace(morethan - (n - morethan), c, morethan, total_occ);
    }

    sort(ans.begin(), ans.end());

    cout << ans << "\n";

    return 0;
}
Code (C++, Solution 2)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    int n, m;
    cin >> n >> m;
    vector<string> strings(n);

    for (int i = 0; i < n; i++) {
        cin >> strings[i];
    }

    vector<vector<int>> char_cnt(n, vector<int>(26, 0));

    for (int i = 0; i < n; i++) {
        for (char c : strings[i]) {
            char_cnt[i][c - 'a']++;
        }
    }

    string ans = "";

    for (int j = 0; j < m; j++) {
        int maxc = -1;
        int arg = -1;
        for (int c = 0; c < 26; c++) {
            int cnt = 0;
            for (int i = 0; i < n; i++) {
                if (char_cnt[i][c] != 0) cnt++;
            }

            if (cnt > maxc) {
                maxc = cnt;
                arg = c;
            }
        }

        ans += ('a' + arg);

        for (int i = 0; i < n; i++) {
            if (char_cnt[i][arg] != 0) char_cnt[i][arg]--;
        }
    }

    sort(ans.begin(), ans.end());
    cout << ans << "\n";
    return 0;
}

P5: Reachability Queries

Solution

Group 1: No road is damaged

The problem setup can be modelled as a graph, where each city is a node and each road is an edge with weight equal to its length. Let GG be the graph of the Magical Charming Conurbation before the apocalypse, and GG' be the graph after the apocalypse.

Since no roads are damaged during the apocalypse, the answer will be the same for all queries. This answer is the number of cities reachable from city 11 regardless of distance. Hence, we can perform a depth first search (DFS) or breadth first search (BFS) on graph GG to find the size of the connected component containing city 11, which is our answer.

Time complexity: O(N+M)O(N+M)

Group 2: $N, M, Q \leq 5000$

We answer each query independently. For each query, we run a single-source shortest path algorithm (e.g. Dijkstra’s algorithm) from node $1$ on graph $G$. Let $A$ be the set of nodes that can be reached from node $1$ within distance $K$.

Now, a node is reachable if and only if it can be reached from one of the nodes in $A$ by travelling along the edges of $G'$. We can count the reachable nodes by finding which component each node belongs to in $G'$ (using DFS, BFS, or a Disjoint Set Union (DSU)), and checking whether each node belongs to a component that contains a node in $A$.

Time complexity: $O(QN \log M)$
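One Group 2 query can be sketched as follows (0-indexed nodes; `adj` is the adjacency list of $G$, while `comp` and `comp_members`, the $G'$ component id of each node and the member list of each component, are names we assume precomputed):

```python
import heapq

def answer_query(n, adj, comp, comp_members, k):
    # Dijkstra from node 0 on the pre-apocalypse graph G.
    INF = float("inf")
    dist = [INF] * n
    dist[0] = 0
    pq = [(0, 0)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    # A node is reachable iff its G' component contains a node within k.
    good = {comp[v] for v in range(n) if dist[v] <= k}
    return sum(len(comp_members[c]) for c in good)
```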

Group 3: No additional constraints

Sort the queries. Just like in the solution for Group 2, run a single-source shortest path algorithm from node $1$ on graph $G$, and also find which component each node belongs to in $G'$. Create a list $B$ of all the nodes, sorted by their distance from node $1$.

For each query, find the nodes within distance $K$ that weren’t within range in the previous query by shifting a pointer along list $B$ (maintain the first index $i$ such that node $B_i$ cannot be reached from node $1$ within distance $K$). For each of these newly in-range nodes, check whether its connected component in $G'$ has already been counted. If it hasn’t, add the size of its connected component to the answer.

Time complexity: $O((N+M) \log N + Q \log Q)$

Code (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;

struct unionfind {
    vector<int> p;
    vector<int> sz;
    vector<int> chosen;
    int n;
    unionfind(int n) : n(n), p(n), sz(n, 1), chosen(n, 0) {
        iota(p.begin(), p.end(), 0);
    }

    int find(int x) {
        return (p[x] == x) ? x : (p[x] = find(p[x]));
    }

    void unite(int x, int y) {
        x = find(x);
        y = find(y);
        if (x != y) {
            p[x] = y;
            sz[y] += sz[x];
        }
    }
};

const long long INF = 1e9;
int main() {
    int n, m;
    cin >> n >> m;

    vector<tuple<int, int, int>> edges(m);
    for (auto& e : edges) {
        int u, v, w;
        cin >> u >> v >> w;
        u--, v--;
        e = make_tuple(u, v, w);
    }

    int s; cin >> s;
    vector<int> re(s, 0);
    for (auto& i : re) {
        cin >> i;
        i--;
    }
    sort(re.begin(), re.end());

    int idx = 0;
    unionfind dsu(n);
    vector< vector<pair<int, long long>> > g(n);
    for (int i = 0; i < m; i++) {
        int u, v, w;
        tie(u, v, w) = edges[i];
        g[u].emplace_back(v, w);
        g[v].emplace_back(u, w);
        if (idx != s && re[idx] == i) {
            idx++;
        }
        else {
            dsu.unite(u, v);
        }
    }

    int q; cin >> q;
    vector<pair<long long, int> > queries(q);
    for (int i = 0; i < q; i++) {
        long long k;
        cin >> k;
        queries[i] = {k, i};
    }
    sort(queries.begin(), queries.end());

    idx = 0;

    vector<long long> ans(q, 0);

    vector<long long> dist(n, INF * INF);
    priority_queue<pair<long long, int>, vector<pair<long long, int>>, greater<pair<long long, int>>> pq;
    pq.emplace(0, 0);
    dist[0] = 0;
    long long cur_ans = 0;
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (dist[u] != d) continue;

        while (idx != q && d > queries[idx].first) {
            ans[queries[idx].second] = cur_ans;
            idx++;
        }

        if (dsu.chosen[dsu.find(u)] == 0) {
            cur_ans += dsu.sz[dsu.find(u)];
            dsu.chosen[dsu.find(u)] = 1;
        }

        for (auto [v, w] : g[u]) {
            if (dist[v] > d + w) {
                dist[v] = d + w;
                pq.emplace(dist[v], v);
            }
        }
    }
    while (idx != q) {
        ans[queries[idx].second] = cur_ans;
        idx++;
    }

    for (auto& x : ans) {
        cout << x << "\n";
    }
}

P6: Increasing Subsequence Median Sum

Solution

Subtask 1

The only increasing subsequences are those of length 1 (the individual elements of the array), so the answer is the sum of all elements of $A$.

Time complexity: $O(N)$

Subtask 2

Loop through all $2^N$ subsequences of $A$ and check whether each subsequence is good in $O(N)$.

Time complexity: $O(N \cdot 2^N)$

Subtask 3

The only possible increasing subsequences with length greater than 1 will be of the form $[1, 2, 3]$. So we just need to count the number of subsequences of $A$ equal to $[1, 2, 3]$. For each index $i$, find the number of 1s to the left of it (inclusive), $L_i$, and the number of 3s to the right of it (inclusive), $R_i$. This can be done in $O(N)$ time. Now, for every index $i$ such that $A_i = 2$, add $2 L_i R_i$ to the answer (each such subsequence has median 2). Don’t forget the length-1 subsequences: also add the sum of the elements of $A$ to the answer.

Time complexity: $O(N)$
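A sketch of this counting (values restricted to 1, 2, 3; any required modulus is omitted for clarity, and the function name is ours):

```python
def subtask3(a):
    # Sum of medians: singletons contribute sum(a); each [1, 2, 3]
    # subsequence contributes its median, 2.
    n = len(a)
    ones_left, threes_right = [0] * n, [0] * n
    c = 0
    for i in range(n):
        if a[i] == 1:
            c += 1
        ones_left[i] = c
    c = 0
    for i in range(n - 1, -1, -1):
        if a[i] == 3:
            c += 1
        threes_right[i] = c
    ans = sum(a)
    for i in range(n):
        if a[i] == 2:
            ans += 2 * ones_left[i] * threes_right[i]
    return ans
```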

Subtask 4

Consider counting the contribution of each median to the answer (this hints at the general solution). We iterate over each element at index $i$ as the median, then iterate over $k = 1$ to $\min(i, N-i-1)$, adding $\binom{i}{k} \binom{N-i-1}{k}$ to the contribution of $i$. Here $k$ represents the number of elements on either side of the median element, and the $k = 0$ term (the length-1 subsequence $[A_i]$) contributes 1. Finally, add $A_i$ times the contribution of $i$ to the answer. Bonus: rewriting $\binom{i}{k} \binom{N-i-1}{k}$ as $\binom{i}{k} \binom{N-i-1}{N-i-1-k}$ and applying Vandermonde’s identity, the contribution of element $i$ is simply $\binom{N-1}{\min(i, N-i-1)}$. Therefore, we can solve this subtask in $O(N)$ time.

Time complexity: $O(N^2)$ or $O(N)$
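The bonus claim is an instance of Vandermonde's identity and can be checked numerically (the $k = 0$ term for the length-1 subsequence is included in the sum):

```python
import math

# sum_k C(i, k) * C(N-1-i, k) == C(N-1, min(i, N-1-i)) for 0 <= i < N
for N in range(1, 12):
    for i in range(N):
        lhs = sum(math.comb(i, k) * math.comb(N - 1 - i, k)
                  for k in range(min(i, N - 1 - i) + 1))
        assert lhs == math.comb(N - 1, min(i, N - 1 - i))
```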

Subtask 5

Similar to Subtask 4, consider counting the contribution of each median. Let $dp(i, j)$ be the number of increasing subsequences of length $j$ ending at index $i$.

Base case: $dp(i, 1) = 1$.

Transition: add $dp(k, j-1)$ to $dp(i, j)$ for all $k < i$ with $A_k < A_i$.

This DP table can be computed in $O(N^3)$ time.

We can define $dp2(i, j)$ as the number of increasing subsequences of length $j$ starting at index $i$. This table can be generated in a similar manner to $dp$.

Now the contribution of the element at index $i$ is the sum of $dp(i, k) \cdot dp2(i, k)$ over all $k$ satisfying $1 \le k \le \min(i, N-i-1) + 1$.

Time complexity: $O(N^3)$
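A Python transcription of this $O(N^3)$ DP (the same recurrences as the C++ Subtask 5 code further down; `mod` follows that code's modulus):

```python
def median_sum_cubic(a, mod=998244353):
    # dp[i][j]:  increasing subsequences of length j ending at index i
    # dp2[i][j]: increasing subsequences of length j starting at index i
    n = len(a)
    dp = [[0] * (n + 2) for _ in range(n)]
    dp2 = [[0] * (n + 2) for _ in range(n)]
    for i in range(n):
        dp[i][1] = 1
        for j in range(2, i + 2):
            dp[i][j] = sum(dp[k][j - 1] for k in range(i) if a[k] < a[i]) % mod
    for i in range(n - 1, -1, -1):
        dp2[i][1] = 1
        for j in range(2, n - i + 1):
            dp2[i][j] = sum(dp2[k][j - 1]
                            for k in range(i + 1, n) if a[k] > a[i]) % mod
    ans = 0
    for i in range(n):
        # element i is the median of length-(2k - 1) subsequences
        for k in range(1, min(i, n - i - 1) + 2):
            ans = (ans + a[i] * dp[i][k] * dp2[i][k]) % mod
    return ans
```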

Subtask 6

Notice that the bottleneck of our naive DP solution is the $O(N)$ transition. We can use $N$ Fenwick trees, where each Fenwick tree corresponds to a value of $j$: at index $i$ of the array underlying Fenwick tree $j$, we store the sum of $dp(k, j)$ over all $k$ such that $A_k = i$. We can then calculate the transition in $O(\log N)$ time by querying the prefix sum up to index $A_i - 1$ in Fenwick tree $j-1$ when computing $dp(i, j)$.

Time complexity: $O(N^2 \log N)$

Space complexity: $O(N \max(A_i))$

Subtask 7

We can perform coordinate compression on the values of $A$, then apply our solution to Subtask 6.

Time complexity: $O(N^2 \log N)$

Space complexity: $O(N^2)$

Footnote on constant factors

When writing the model solution for this problem, the initial code took 50 seconds to run even though it had a time complexity of $O(N^2 \log N)$, running slower than some $O(N^3)$ solutions. After some constant-factor optimization, I managed to get it to run in 1.5 seconds. Here are the optimizations that caused a significant speedup, from most to least significant:

  • For C++, turning on -O2 optimization when compiling.
  • Using std::unordered_map for coordinate compression and precomputing the compressed array instead of compressing on the fly when needed. This reduced the number of map accesses from $O(N^2)$ to $O(N)$.
  • The second dimension of the dp array only needs to have size $\frac{N}{2}+1$ instead of $N+1$.
  • When reducing modulo in some places, we can do a -= mod instead of a %= mod. This is quicker, as subtraction is faster than the modulo operation.
Code (C++, Subtask 5)
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 998244353;
int dp[8001][8001];
int dp2[8001][8001];

int main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0), cout.tie(0);
    int n;
    cin >> n;

    vector<int> a(n);
    for (auto& i : a) cin >> i;

    memset(dp, 0, sizeof(dp));

    for (int i = 0; i < n; i++) dp[1][i] = 1;
    for (int len = 2; len < n; len++) {
        for (int i = len - 1; i < n; i++) {
            for (int j = len - 2; j < i; j++) {
                if (a[j] < a[i]) {
                    dp[len][i] += dp[len - 1][j];
                    if (dp[len][i] >= MOD) dp[len][i] -= MOD;
                }
            }
        }
    }

    memset(dp2, 0, sizeof(dp2));

    for (int i = 0; i < n; i++) dp2[1][i] = 1;
    for (int len = 2; len <= n; len++) {
        for (int i = n - len; i >= 0; i--) {
            for (int j = i + 1; j < n - len + 2; j++) {
                if (a[i] < a[j]) {
                    dp2[len][i] += dp2[len - 1][j];
                    if (dp2[len][i] >= MOD) dp2[len][i] -= MOD;
                }
            }
        }
    }

    // dp pre, dp2 suf
    int ans = 0;
    for (int i = 0; i < n; i++) {
        ans += a[i];
        if (ans >= MOD) ans -= MOD;

        for (int len = 1; len < n; len++) {
            if (i < len || (n - i) - 1 < len) break;

            long long left = dp[len + 1][i], right = dp2[len + 1][i];

            ans += ((left * right) % MOD) * a[i] % MOD;
            if (ans >= MOD) ans -= MOD;
        }
    }
    cout << ans << "\n";
}
Code (C++, full solution)
#include <bits/stdc++.h>
using namespace std;
#define int long long
const long long MOD = 998244353;
int dp[8001][8001];
int dp2[8001][8001];

struct FT {
    vector<int> s;
    FT(int n) : s(n) {}
    void update(int pos, int dif) { // a[pos] += dif
        for (; pos < s.size(); pos |= pos + 1) {
            s[pos] += dif;
            if (s[pos] >= MOD) s[pos] -= MOD;
        }
    }
    int query(int pos) { // sum of values in [0, pos)
        int res = 0;
        for (; pos > 0; pos &= pos - 1) {
            res += s[pos - 1];
            if (res >= MOD) res -= MOD;
        }
        return res;
    }
};

signed main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0), cout.tie(0);

    int n;
    cin >> n;
    vector<int> a(n);
    set<int> st;
    for (int i = 0; i < n; i++) {
        cin >> a[i];
        st.emplace(a[i]);
    }
    map<int, int> coor;
    for (int x : st) {
        coor[x] = coor.size() + 1;
    }
    vector<int> ori = a;
    for (int i = 0; i < n; i++) {
        a[i] = coor[a[i]];
    }

    memset(dp, 0, sizeof(dp));
    vector<FT> helper(n + 2, FT(n + 2));
    for (int i = 0; i < n; i++) {
        for (int len = 1; len <= i + 1; len++) {
            if (len == 1) {
                dp[len][i] = 1;
                helper[1].update(a[i], 1);
                continue;
            }
            dp[len][i] = helper[len - 1].query(a[i]);
            helper[len].update(a[i], dp[len][i]);
        }
    }

    memset(dp2, 0, sizeof(dp2));

    helper.assign(n + 2, FT(n + 2));
    for (int i = n - 1; i >= 0; i--) {
        for (int len = 1; len <= n - i; len++) {
            if (len == 1) {
                dp2[len][i] = 1;
                helper[1].update(a[i], 1);
                continue;
            }
            dp2[len][i] = (helper[len - 1].query(n + 1) - helper[len - 1].query(a[i] + 1) + MOD);
            if (dp2[len][i] >= MOD) dp2[len][i] -= MOD;
            helper[len].update(a[i], dp2[len][i]);
        }
    }

    // dp pre, dp2 suf
    int ans = 0;
    for (int i = 0; i < n; i++) {
        ans += ori[i];
        if (ans >= MOD) ans -= MOD;

        for (int len = 1; len < n; len++) {
            if (i < len || (n - i) - 1 < len) break;

            long long left = dp[len + 1][i], right = dp2[len + 1][i];
            ans += ((left * right) % MOD) * ori[i] % MOD;
            if (ans >= MOD) ans -= MOD;
        }
    }
    cout << ans << "\n";

}