Perplexity Pages; inferenceable; The AI Paper Mills Problem
Yep. Back-to-back ThursdAI editions!
The core theme for this one is “research”.
TL;DR
(This is an AI-generated summary of today’s Drop)
- Perplexity Pages: Perplexity has introduced a new “Pages” feature to help users create comprehensive, visually appealing content quickly. This tool allows for the addition of media and customization of layout, aiming to simplify the process of creating and sharing information. Sample Page on Maine
- inferenceable: HyperMink is developing a desktop research tool called HyperMink Desktop, which integrates a locally-stored large language model with the user’s personal data. This tool aims to run LLMs locally, ensuring privacy and enabling personalized, contextual responses. HyperMink Desktop
- The AI Paper Mills Problem: The proliferation of AI-generated papers from “paper mills” is undermining the integrity of scientific research. Wiley has retracted over 11,300 papers and shut down 19 journals due to this issue. Efforts are being made to combat this problem through technological solutions and institutional policies. Wiley Shutters 19 Journals
Perplexity Pages

Perplexity (the thing that, sigh, still powers the Drop summaries) has a new “Pages” feature. It’s designed to “help users create comprehensive, visually appealing content on any topic quickly and efficiently.” In theory, its goal is to simplify the process of creating, organizing, and sharing information, and the folks at Perplexity sum it up as helping you showcase the “value of [your] ideas rather than just [your] writing skills by breaking down complex subjects into digestible pieces.”
As with all Perplexity-generated content, sources are shown/listed with the generated paragraphs/sections. You can also add media and (slightly) customize the layout of the pages. And, the generated content can be tailored depending on the expertise of your audience.
This is a sample “Page” on my home state, Maine. I let Perplexity create the initial version, then followed one of their suggested prompts — “Youth Development Programs” — where it promptly (heh) added a useless section on youth programs in Mexico. I deleted that section, tailored the prompt to Maine, and selected a different default section image than the one they provided.
It’s an interesting new feature that’s going to make the lives of teachers/professors that much worse come next Fall, and likely cause problems at work as folks lazily just accept what the tool provides without adding human editing, curation, triple-fact-checking, etc. We humans are inherently lazy creatures.
I also expect this to be used heavily by political parties and adversarial political actors everywhere — including the U.S. — to further help erode democracy and usher in our forthcoming, inevitable, new feudal society. They’ll do this both by generating professional-looking disinformation “Pages” to capture the unwary, and crafting screeds to bolster the basest of the base.
My less cynical take on this is that it will absolutely help get a very rough first draft of a concept, idea, script, etc. cranked out for folks who need to spend a great deal of time to personally work on the details, while helping to remove some of the mundanity of necessary boilerplate baseline/background information.
Between birthdays, graduations, and a much needed Acadia holiday, I doubt I’ll have much time to put Pages through the necessary paces until sometime after mid-June. But, I will definitely do that and report back.
I wonder how soon Perplexica will add a similar open source feature.
inferenceable

HyperMink is developing a desktop research tool called HyperMink Desktop (GH) that integrates a locally-stored large language model (LLM) with the user’s personal context and data. According to that “about” post, the idea is to:
- run LLMs entirely on our computers without servers or internet connectivity. HyperMink has been doing amazing things in the quantization space (which helps these giant beasts run acceptably fast on more common hardware). Running everything locally sounds great: I’m not thrilled about all the stuff I type into chat boxes, and I don’t trust any of these AI companies to respect privacy promises.
- enable inference on our local files and collections of local and remote webpages to help generate personalized and contextual responses, vs. rely on outdated training data. This is essentially RAG (retrieval augmented generation), but it (hopefully) does all the work for you.
- support both ephemerality (i.e., let you leave no trace of the local inference sessions) and persistence (so, kind of build a local knowledge base).
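To make that RAG bullet a bit more concrete, here is a deliberately tiny, generic sketch of the retrieve-then-prompt pattern — this is my own illustration of the technique, not HyperMink’s actual implementation, and the scoring is naive word overlap where a real tool would use embeddings:

```python
# Hypothetical RAG sketch: pick the local document that best matches the
# question, then prepend it to the prompt so the LLM answers from current
# local context instead of stale training data.

def retrieve(question: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str, docs: list[str]) -> str:
    """Stuff the retrieved context ahead of the question, RAG-style."""
    context = retrieve(question, docs)
    return f"Context: {context}\n\nQuestion: {question}\nAnswer:"

docs = [
    "Acadia National Park is on Mount Desert Island in Maine.",
    "Phi-3-Mini is a small language model from Microsoft.",
]
print(build_prompt("Where is Acadia National Park?", docs))
```

The point of the sketch is just the shape of the pipeline: retrieval happens outside the model, and the model only ever sees whatever context the retriever stuffed into the prompt.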
$ git clone https://github.com/HyperMink/inferenceable.git
$ cd inferenceable
$ npm install
$ npm start
gets you started (it has to download models, so be patient, and try to be on a fast internet link). For me, it grabbed Phi-3-Mini-4k and Llava-Vision-7B. It also figured out I’m on Apple Silicon and tailored the downloads for Metal.
I gave it this week’s Typography Tuesday markdown and asked it to generate a summary for it. This is what I got back O_o:

I noticed in some other output it may expect BEGININPUT and ENDINPUT as markers for pasted-in content so I did that and asked it for a summary (I also changed the name to “Mother” as an homage to the AI on the military spacecraft in the Aliens franchise/universe):
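For anyone trying the same trick, the wrapper looked something like the following — though I’m inferring the format from stray output rather than any documentation, so treat it as a guess:

```
BEGININPUT
<pasted markdown or text goes here>
ENDINPUT
Summarize the above input.
```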

It has a “vision” mode, so I decided to give that a whirl:

It did much better, tho it was kind of a “yawn” answer: “This is a picture of a small bird perched on a branch.”
I’m sure this will get better over time. And, I suspect it does better with models running on NVIDIA GPUs. So, we’ll check back in on this and see how it stacks up against Perplexica and some other similar local AI tools.
The AI Paper Mills Problem

Not to be even more hyperbolic in today’s Drop, but there’s a bit of a crisis happening in academic publishing due to the proliferation of AI-generated papers from “paper mills”. This is having the very real effect of undermining the (already decreasing) integrity of scientific research and further crumbling the public’s trust in academia. Increasingly, fraudulent papers are being filled with fabricated results and the tell-tale “tortured phrases” of our LLM/GPT overlords. These (somehow) slip past peer review and get published in respected journals (ref: the aforementioned “we humans are lazy” thing).
Wiley just had to shutter 19 journals due to this [AI] paper mill problem. Over the past two years, they wound up retracting more than 11,300 papers from their Hindawi portfolio. Clarivate has delisted scads more, as have other, similar groups.
While not all paper mills are AI-based, it’s just too easy to use AI to meet the perverse incentives of the present academic publishing ecosystem. United2Act hopes to develop and implement effective strategies to address systematic research fraud perpetrated by [AI] paper mills.
Given the obvious desires of billionaires and autocrats to only care about science that benefits the machine of late-stage capitalism, I’m not hopeful we’ll find technical solutions/tools to help combat this scourge fast enough to counter the onslaught of fraudulent publications. The incentives for OpenAI, Microsoft, Google, Meta, and others are to help support this destruction vs. combat it (and, only allow what benefits them to make it into public use).
I know I have more than a few readers in academia who review papers (y’all are heroes for that, btw). I’m curious what your experiences have been in the past few years since the fateful emergence of ChatGPT in November of 2022. If you’re not comfortable dropping a comment, email or DM/PMs (which I will keep in confidence) are also great.
FIN
Remember, you can follow and interact with the full text of The Daily Drop’s free posts on Mastodon via @dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev ☮️
Leave a comment