Just because it came from university doesn't mean it's "ethical" in the modern copyrightist sense. Academia did not care much about the copyright and scraped, cleaned and used web data at will. It is only now after commercial entities have labour-replacing AIs in the oven that copyright maximalism has gained ground.