“…Not even the latest block detection techniques (see, e.g., [115,110,124,66,123,121,53,81]) implement another block detection phase as a preprocessing step. Many techniques implement simple preprocessing methods, such as removing nodes that certainly contain no content to extract (see, e.g., [115,90,110]) or standardizing and pre-cleaning the HTML code (see, e.g., [105,79]).…”
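
To make the first kind of simple preprocessing concrete, the following is a minimal sketch of removing nodes that cannot hold extractable content. It is not the method of any of the cited techniques. The tag set `NON_CONTENT_TAGS`, the class `ContentFilter`, and the function `strip_non_content` are hypothetical names chosen for illustration, and dropping comments is likewise an assumption. Only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Assumption: tags treated as never holding extractable content.
# The cited techniques may use a different (or richer) criterion.
NON_CONTENT_TAGS = {"script", "style", "noscript", "template"}


class ContentFilter(HTMLParser):
    """Re-emits the input HTML while skipping non-content subtrees."""

    def __init__(self):
        # Keep character references unconverted so they are re-emitted verbatim.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a non-content subtree

    def handle_starttag(self, tag, attrs):
        if tag in NON_CONTENT_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags open no subtree, so the depth is unchanged.
        if self.skip_depth == 0 and tag not in NON_CONTENT_TAGS:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in NON_CONTENT_TAGS:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.parts.append(f"&#{name};")

    # Comments are dropped: the default handle_comment does nothing.


def strip_non_content(html: str) -> str:
    """Return `html` with non-content subtrees and comments removed."""
    parser = ContentFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Calling `strip_non_content(page_source)` on a page's markup would return the same markup without `script`, `style`, `noscript`, and `template` subtrees. A filter of this kind only discards nodes by tag name; it does not detect blocks, which is consistent with the distinction the passage draws between simple preprocessing and a separate block detection phase.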